NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358, or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, a compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a
signature region (as set forth in Table 3); or for which they have homology to a gene family (as
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise, "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
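
By way of illustration only, the base-pairing rule in the foregoing definition can be sketched in a few lines of Python. The names `complement`, `reverse_complement` and `paired_fraction` are hypothetical and are not part of the specification; the sketch assumes DNA strands written with the letters A, C, G, T and N only.

```python
# Watson-Crick pairing for DNA; N (any base) is mapped to N.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def complement(seq):
    """Return the base-by-base complement of seq.

    If seq is written 5'->3', the result is the partner strand written 3'->5'.
    """
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the partner strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def paired_fraction(strand_5to3, partner_3to5):
    """Fraction of positions that base-pair between two equal-length strands.

    1.0 corresponds to "complete" complementarity; a lower value is "partial".
    """
    if not strand_5to3 or len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    pairs = zip(strand_5to3.upper(), partner_3to5.upper())
    return sum(PAIRS[a] == b for a, b in pairs) / len(strand_5to3)
```

For the example given above, `complement("AGT")` yields `"TCA"` (read 3' to 5'), and `paired_fraction("AGT", "TCA")` is 1.0, i.e., complete complementarity.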

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs comprises nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300.
In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are
base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also approximately
one in five because expressed sequences comprise less than approximately 5% of the entire
genome sequence.
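
By way of illustration only, the arithmetic of the foregoing paragraph can be reproduced with a short Python sketch. The sketch adopts the paragraph's own simplifying assumptions (three billion base pairs per set of chromosomes, expressed sequences taken as 5% of the genome, and all k-mers equally likely); the names `GENOME_BP`, `EXPRESSED_FRACTION` and `kmers_per_position` are hypothetical.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (figure from the text)
EXPRESSED_FRACTION = 0.05   # upper bound from the text for expressed sequences


def kmers_per_position(k, target_bp):
    """Ratio of possible k-mers (4**k) to the number of base pairs in the target.

    A ratio of R corresponds to a chance of roughly "1 in R" that a given
    k-mer is fully matched somewhere in the target.
    """
    return 4 ** k / target_bp


twenty_mer = kmers_per_position(20, GENOME_BP)
seventeen_mer = kmers_per_position(17, GENOME_BP)
fifteen_mer_expressed = kmers_per_position(15, GENOME_BP * EXPRESSED_FRACTION)

print(f"20-mer, genome:    1 in {twenty_mer:.0f}")
print(f"17-mer, genome:    1 in {seventeen_mer:.1f}")
print(f"15-mer, expressed: 1 in {fifteen_mer_expressed:.1f}")
```

Under these assumptions the ratios are roughly 367, 5.7 and 7.2, which agree in order of magnitude with the rounded figures of 1 in 300 and 1 in 5 stated above.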

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
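
By way of illustration only, the single-mismatch calculation described above can be sketched in Python as follows. The sketch assumes, as the text does, equally likely bases, three billion base pairs per set of chromosomes and expressed sequences amounting to 5% of the genome; the function and constant names are hypothetical.

```python
GENOME_BP = 3_000_000_000             # base pairs in one set of human chromosomes
EXPRESSED_BP = GENOME_BP * 5 // 100   # expressed sequences taken as 5% of the genome


def single_mismatch_probability(k):
    """Probability that a random k-mer site differs from a given k-mer at exactly
    one position, computed as in the text: (1 / 4**k) * (3 * k)."""
    return (3 * k) / 4 ** k


def expected_single_mismatch_sites(k, target_bp):
    """Expected number of single-mismatch sites for one k-mer in the target."""
    return single_mismatch_probability(k) * target_bp


genome_25 = expected_single_mismatch_sites(25, GENOME_BP)
expressed_18 = expected_single_mismatch_sites(18, EXPRESSED_BP)
genome_20 = expected_single_mismatch_sites(20, GENOME_BP)

print(f"25-mer, genome:    {genome_25:.1e}")
print(f"18-mer, expressed: {expressed_18:.2f}")
print(f"20-mer, genome:    {genome_20:.2f}")
```

Under these assumptions the eighteen-mer and twenty-mer values come to about 0.12 and 0.16 respectively, i.e., of the same order as the rounded figure of one in five given above, while the twenty-five mer value is on the order of 2 in 10,000.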

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
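
By way of illustration only, the foregoing definition can be expressed as a short Python sketch that lists every maximal run of triplets free of termination codons. The sketch scans only the three forward reading frames of a DNA sequence, uses the standard termination codons TAA, TAG and TGA, and, consistent with the definition above, does not require an initiating ATG; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (DNA alphabet)


def open_reading_frames(seq):
    """Yield (frame, start, orf) for each maximal stop-free run of triplets.

    frame is 0, 1 or 2; start is the 0-based index of the first nucleotide of
    the run; orf is the run itself, with a length that is a multiple of three.
    """
    seq = seq.upper()
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                if pos > run_start:
                    yield frame, run_start, seq[run_start:pos]
                run_start = pos + 3  # next run begins after the stop codon
            pos += 3
        if pos > run_start:  # run that reaches the end of the sequence
            yield frame, run_start, seq[run_start:pos]


def longest_orf(seq):
    """Return the (frame, start, orf) tuple with the longest orf, or None."""
    return max(open_reading_frames(seq), key=lambda item: len(item[2]), default=None)
```

For example, `longest_orf("ATGAAATAGCCC")` returns `(2, 2, "GAAATAGCC")`, since the third reading frame contains no termination codon, whereas the first frame is interrupted by TAG.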

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein, which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence. 

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
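
By way of illustration only, the "redundancy" of the genetic code referred to above can be sketched in Python using the standard genetic code, with * denoting a stop codon. The names `CODON_TABLE`, `translate`, `synonymous_codons` and `is_silent` are hypothetical; the sketch assumes DNA coding-strand sequences and ignores any trailing partial codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; * marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA coding sequence codon by codon (trailing bases are ignored)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon):
    """Return, in sorted order, all codons encoding the same residue as codon."""
    residue = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def is_silent(original, variant):
    """True if two equal-length coding sequences encode the same amino acids."""
    return len(original) == len(variant) and translate(original) == translate(variant)
```

For example, `synonymous_codons("GCC")` returns the four alanine codons `["GCA", "GCC", "GCG", "GCT"]`, and `is_silent("GCCGAA", "GCTGAG")` is true because both sequences encode alanine followed by glutamic acid.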

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.
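
By way of illustration only, the four classes of amino acids named above (nonpolar, polar neutral, basic and acidic) can be used to label a substitution as conservative or non-conservative, as in the following Python sketch. The one-letter residue codes are the standard ones; the names `CLASSES`, `is_conservative` and `classify_substitutions` are hypothetical, and the sketch treats membership in the same class as the sole criterion.

```python
# Residue classes as listed in the text, in standard one-letter codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),     # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"), # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),             # Arg, Lys, His
    "acidic": set("DE"),             # Asp, Glu
}
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def is_conservative(original, replacement):
    """True if two different residues belong to the same class."""
    return original != replacement and CLASS_OF[original] == CLASS_OF[replacement]


def classify_substitutions(reference, variant):
    """List (position, original, replacement, conservative) for each substitution.

    Positions are 1-based; the sequences must have equal length, so insertions
    and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i, a, b, is_conservative(a, b))
        for i, (a, b) in enumerate(zip(reference, variant), start=1)
        if a != b
    ]
```

For example, `classify_substitutions("MKLD", "MRLC")` reports a conservative lysine-to-arginine change at position 2 and a non-conservative aspartic acid-to-cysteine change at position 4.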

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
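
By way of non-limiting illustration, the exemplary oligonucleotide wash temperatures recited above can be expressed as a simple lookup. The following Python sketch is illustrative only; the names `OLIGO_WASH_TEMP_C` and `wash_temperature` are hypothetical, and no interpolation is attempted for lengths that are not listed above.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, exactly as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature(oligo: str) -> int:
    """Return the exemplary wash temperature for an oligo of a listed length.

    Hypothetical helper: lengths not listed in the text raise an error
    rather than being interpolated.
    """
    length = len(oligo)
    if length not in OLIGO_WASH_TEMP_C:
        raise ValueError(f"no exemplary wash temperature listed for {length}-mers")
    return OLIGO_WASH_TEMP_C[length]


if __name__ == "__main__":
    print(wash_temperature("ACGTACGTACGTACGTA"))  # a 17-base oligonucleotide
```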

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by 
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can 
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
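
By way of non-limiting illustration, the arithmetic in the definition above (substitutions, additions, and deletions divided by the number of residues in the substantially equivalent sequence) can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned by some other method and that gaps are written as "-"; the function names are hypothetical, and this is not the Jotun Hein method referred to above.

```python
def fraction_varied(reference_aln: str, subject_aln: str) -> float:
    """Differences divided by the number of residues in the subject sequence.

    Both inputs are assumed to be pre-aligned strings of equal length in which
    '-' marks a gap. A column counts as one difference when the two sequences
    disagree (a substitution, an addition, or a deletion).
    """
    reference_aln = reference_aln.upper()
    subject_aln = subject_aln.upper()
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for ref, sub in zip(reference_aln, subject_aln):
        if ref == "-" and sub == "-":
            continue  # a column of two gaps carries no information
        if ref != sub:
            differences += 1
    subject_residues = sum(1 for ch in subject_aln if ch != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def percent_identity(reference_aln: str, subject_aln: str) -> float:
    """Percent sequence identity as 100 * (1 - fraction varied)."""
    return 100.0 * (1.0 - fraction_varied(reference_aln, subject_aln))


def is_substantially_equivalent(reference_aln: str, subject_aln: str,
                                max_fraction: float = 0.35) -> bool:
    """Apply the 'no more than about 35%' sequence criterion from the text."""
    return fraction_varied(reference_aln, subject_aln) <= max_fraction


if __name__ == "__main__":
    # One substitution in ten residues: fraction varied 0.1
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

The check covers only the sequence criterion; whether the net effect produces an adverse functional dissimilarity is a separate question that such a calculation does not address.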

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO:1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also include polynucleotides comprising nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.
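
By way of non-limiting illustration, one crude computational screen for fragments "based on unique nucleotide sequences" is to list fragments of a target sequence that do not occur, on either strand, in a set of related sequences. The Python sketch below is an assumption-laden proxy only: exact string matching is used in place of any model of hybridization, and the names `reverse_complement` and `candidate_probes` and the default length of 17 are hypothetical choices.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(target: str, others: list[str], k: int = 17) -> list[tuple[int, str]]:
    """List (offset, fragment) pairs of length k found in target but absent,
    on either strand, from every sequence in others.

    Exact matching is only a rough stand-in for selectivity; real probes would
    still need to be checked under the intended hybridization conditions.
    """
    target = target.upper()
    background = [o.upper() for o in others]
    background += [reverse_complement(o) for o in others]
    probes = []
    for i in range(len(target) - k + 1):
        fragment = target[i:i + k]
        if not any(fragment in b for b in background):
            probes.append((i, fragment))
    return probes


if __name__ == "__main__":
    # Toy sequences invented for the example, with a short k so the output is small.
    for offset, fragment in candidate_probes("ACGTACGTTTGACCA", ["ACGTACGTAAAA"], k=8):
        print(offset, fragment)
```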

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
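
By way of non-limiting illustration, the substitution of one codon for another codon encoding the same amino acid can be sketched in Python using the standard genetic code. The function names and the toy ORF are hypothetical; the sketch enumerates single-codon substitutions only and confirms that the translation is unchanged.

```python
from itertools import product

# Standard genetic code, built with bases in T, C, A, G order; '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF (DNA alphabet, length divisible by 3) into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def synonymous_variants(orf: str):
    """Yield every ORF that differs from the input by one synonymous codon substitution."""
    orf = orf.upper()
    for i in range(0, len(orf), 3):
        codon = orf[i:i + 3]
        for alternative, aa in CODON_TABLE.items():
            if alternative != codon and aa == CODON_TABLE[codon]:
                yield orf[:i] + alternative + orf[i + 3:]


if __name__ == "__main__":
    toy_orf = "ATGGCTTGA"  # invented example: Met, Ala, stop
    for variant in synonymous_variants(toy_orf):
        assert translate(variant) == translate(toy_orf)
        print(variant)
```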

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
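
By way of non-limiting illustration, the construction of a mutagenic oligonucleotide (the changed codon flanked on both sides by unaltered template sequence) can be sketched in Python as follows. The function name `mutagenic_oligo`, the default flank of 12 nucleotides, and the toy template are hypothetical; the sketch does not evaluate duplex stability, which would have to be assessed separately.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str, flank: int = 12) -> str:
    """Return an oligonucleotide carrying new_codon in place of the codon at
    codon_index (0-based, in frame from the first base of template), flanked
    on each side by `flank` nucleotides copied unchanged from the template.

    The flank length is an illustrative assumption, not a value taken from
    the text; it stands in for 'sufficient adjacent nucleotides'.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three bases from A, C, G, T")
    start = codon_index * 3
    end = start + 3
    if codon_index < 0 or end > len(template):
        raise ValueError("codon_index lies outside the template")
    if start < flank or len(template) - end < flank:
        raise ValueError("not enough flanking template sequence on both sides")
    return template[start - flank:start] + new_codon + template[end:end + flank]


if __name__ == "__main__":
    # Invented template: ATG GCT AAA GGT CTG GAA ACC; change codon 3 (GGT) to GCT.
    print(mutagenic_oligo("ATGGCTAAAGGTCTGGAAACC", 3, "GCT", flank=6))
```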

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells.
Also included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of
suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the 
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies 
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of an mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
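
By way of non-limiting illustration, the design of an antisense oligonucleotide by Watson and Crick base pairing against the region surrounding a translation start site can be sketched in Python as follows. The function names, the default length of 20, and the toy sense sequence are hypothetical; the start site position is supplied by the caller rather than predicted, and Hoogsteen pairing and modified nucleotides are not modeled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start_site: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide complementary to a window of the sense
    strand roughly centered on start_site (0-based index of the first base
    of the start codon).

    The window is shifted as needed so that it stays within the sequence.
    An mRNA written with U is accepted and treated as its DNA equivalent.
    """
    sense = sense.upper().replace("U", "T")
    if length <= 0 or length > len(sense):
        raise ValueError("length must be between 1 and the sequence length")
    if not 0 <= start_site < len(sense):
        raise ValueError("start_site lies outside the sequence")
    begin = start_site - length // 2
    begin = max(0, min(begin, len(sense) - length))
    return reverse_complement(sense[begin:begin + length])


if __name__ == "__main__":
    # Invented sense strand with the start codon ATG at index 6.
    print(antisense_oligo("GGCACCATGGCTAAAGGT", start_site=6, length=10))
```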

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the 
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The 
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. 
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) 
FEBS Lett 215: 327-330). 

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988) 
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit 
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be 
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786 
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be 
constructed in which the nucleotide sequence of the active site is complementary to the 
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. 
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be 
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA 
molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991) 
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher (1992) Bioassays 14: 807-15. 

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med 
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; 
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or 
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked 
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA 
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then 
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem 
Lett 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; 
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, 
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et 
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res. 
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in co- 
amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, 
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID 
NO:1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the 
invention also include polypeptides preferably with biological or immunological activity that are 
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID 
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences 
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the 
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. 
The invention also provides biologically active or immunologically active variants of any of the 
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding 
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about 
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at 
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by 
allelic variants may have a similar, increased, or decreased activity compared to polypeptides 
comprising SEQ ID NO:1787-3572 and 5359-7144. 
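
The percent identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two amino acid sequences have already been aligned (with "-" marking gaps) and counts identity as identical residues divided by aligned columns; alignment programs differ in how they count gaps, so this convention, the function name, and the example sequences are hypothetical and for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: identical residues / aligned columns, where columns
    that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned peptide fragments, for illustration only.
reference = "MKTAYIAKQR"
variant = "MKTAHIAKQR"
identity = percent_identity(reference, variant)
```

Under this convention the single substitution in the ten-residue example gives 90% identity, which would satisfy an "at least about 90%" threshold but not an "at least about 95%" threshold.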

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
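
The notion of a degenerate variant can be illustrated with a minimal Python sketch that translates two different nucleotide sequences with the standard genetic code and checks that they encode an identical polypeptide. The two short ORFs and the function names are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF (DNA or RNA, read in frame) up to the first stop codon."""
    orf = orf.upper().replace("U", "T")
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical ORFs, for illustration only: both encode Met-Ala-Ser-Lys.
orf = "ATGGCTTCCAAGTAA"
variant = "ATGGCCAGCAAATAA"
```

Here `variant` differs from `orf` at several codon positions, yet `translate` returns the same four-residue peptide for both, so `is_degenerate_variant(orf, variant)` is true.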

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
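
The alanine-scanning approach described above can be sketched in Python as the systematic generation of variants in which single residues, or strings of adjacent residues, are replaced with alanine. The function name, the `window` parameter, and the example peptide are hypothetical; testing each variant for biological activity is an experimental step that the sketch does not model.

```python
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """Return (1-based start position, variant sequence) for each alanine substitution.

    `window` is the number of adjacent residues replaced at once (1 for single
    residues, larger values for strings of amino acids). Windows that already
    consist only of alanine are skipped because the variant would be unchanged.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    variants = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


# Hypothetical peptide, for illustration only.
single_scan = alanine_scan("MKAC")
double_scan = alanine_scan("MKAC", window=2)
```

Each returned variant would then be expressed and assayed, and a loss of activity at a given position would suggest that the substituted residue or residues are important for function.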

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 
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The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest 
match between the sequences tested. Methods to determine identity and similarity are codified in 
computer programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam 
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein 
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. 
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are 
publicly available from the National Center for Biotechnology Information (NCBI) and other 
sources (BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., 
et al., J. Mol. Biol. 215:403-410 (1990)). 
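
By way of illustration only, the notion of percent identity over a "largest match" alignment can be sketched with a minimal global alignment. The scoring values (match +1, mismatch -1, gap -2), the function names and the toy sequences below are assumptions made for this sketch; they are not the parameters or implementations of the GCG, BLAST or FASTA programs cited above, which use substitution matrices and more elaborate gap penalties. Percent identity is taken here as identical columns divided by total alignment columns, which is only one of several conventions.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover the alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment columns."""
    al_a, al_b = global_align(a, b)
    if not al_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(al_a, al_b))
    return 100.0 * matches / len(al_a)


if __name__ == "__main__":
    print(global_align("ACGT", "ACT"))      # ('ACGT', 'AC-T')
    print(percent_identity("ACGT", "ACT"))  # 75.0
```

For real sequence comparisons the published programs listed above remain the appropriate tools; the sketch is meant only to make the definition concrete.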

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically for both the treatment of 
proliferative and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
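
The in-frame requirement described above can be checked computationally before a construct is made. The following is a minimal sketch, assuming the standard genetic code and hypothetical toy sequences; the function names (translate, fuse_in_frame) and the optional linker argument are illustrative assumptions and do not describe any particular vector or kit. The check is that each segment is a whole number of codons and that no stop codon occurs before the final codon of the fusion.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_partner, c_partner, linker=""):
    """Join two coding sequences (N-terminal partner first) and verify the frame.

    n_partner must not carry its own stop codon; c_partner may end with one.
    Raises ValueError if a segment would shift the reading frame or if the
    fusion contains a premature stop codon.
    """
    parts = [p.upper() for p in (n_partner, linker, c_partner)]
    for part in parts:
        if len(part) % 3 != 0:
            raise ValueError("segment length is not a multiple of three")
    fusion = "".join(parts)
    protein = translate(fusion)
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon in fusion")
    return fusion, protein


if __name__ == "__main__":
    # Hypothetical toy sequences, not sequences of the invention.
    tag = "ATGGCT"         # encodes M-A, no stop codon
    insert = "AAAGAATAA"   # encodes K-E followed by a stop codon
    print(fuse_in_frame(tag, insert))  # ('ATGGCTAAAGAATAA', 'MAKE*')
```

A check of this kind does not replace sequencing of the finished construct; it only catches frame shifts and premature stops at the design stage.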

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 
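
As a simple illustration of the antisense principle, an antisense oligonucleotide is the reverse complement of the target region, so that it can base-pair with the target strand. The sketch below assumes a DNA oligonucleotide directed against a short, hypothetical target; the function name antisense_oligo and the example sequence are assumptions made for illustration, and the sketch does not address oligonucleotide chemistry, target-site selection or delivery.

```python
# Watson-Crick complements; U in an RNA target pairs with A in the DNA oligo.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_oligo(target):
    """Return the DNA reverse complement (5'->3') of a DNA or RNA target (5'->3')."""
    target = target.upper()
    unknown = set(target) - set(_COMPLEMENT)
    if unknown:
        raise ValueError("unexpected characters in target: %s" % sorted(unknown))
    # Complement each base, then reverse so the result also reads 5'->3'.
    return "".join(_COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    # Hypothetical six-base mRNA target, not a sequence of the invention.
    print(antisense_oligo("AUGGCU"))  # AGCCAT
```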

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 
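
The logic of combining a selectable marker with a negatively selectable marker can be summarized in a short sketch. The three integration outcomes, the class name Cell and the function name survives_selection are simplifying assumptions made for illustration: a cell is assumed to survive only if it carries the selectable marker and has not retained the flanking negatively selectable marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    label: str
    positive_marker: bool   # selectable marker contiguous with the targeting DNA
    negative_marker: bool   # flanking negatively selectable marker (e.g., TK or gpt)


def survives_selection(cell):
    """Survive positive selection and escape negative selection."""
    return cell.positive_marker and not cell.negative_marker


# Idealized outcomes after introducing the targeting construct.
cells = [
    Cell("no integration", positive_marker=False, negative_marker=False),
    Cell("random integration", positive_marker=True, negative_marker=True),
    Cell("homologous recombination", positive_marker=True, negative_marker=False),
]

survivors = [cell.label for cell in cells if survives_selection(cell)]

if __name__ == "__main__":
    print(survivors)  # ['homologous recombination']
```

In practice selection is not absolute, so surviving clones would still be confirmed by other means; the sketch only captures the intended enrichment.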

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli, 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1, John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific 
promoter driving a selectable marker. The selectable marker allows only cells of the desired type 
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. O. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and is also
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed, has application 
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in 
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89: 11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce 
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146,
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867,
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assay for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52:921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
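
As a rough illustration of the competitive binding readout described above, the diminution in complex formation can be expressed as a percent inhibition relative to a control measured without the test agent. The Python sketch below is illustrative only: the function name, the background-subtraction step and the signal units are assumptions and are not part of the disclosure.

```python
def percent_inhibition(signal_with_agent, signal_control, background=0.0):
    """Percent reduction in complex formation caused by a test agent.

    signal_with_agent: binding signal measured in the presence of the agent
    signal_control:    binding signal measured without the agent
    background:        nonspecific signal subtracted from both (assumed step)
    """
    control = signal_control - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_agent - background
    return 100.0 * (1.0 - treated / control)
```

A value near zero would suggest that the agent does not compete for binding, while a value approaching 100 would suggest nearly complete displacement under the assumed assay conditions.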

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
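
The comparison of two otherwise identical cell populations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the use of a single numeric response per ligand and the fixed difference threshold are all hypothetical choices, not features of the disclosed method.

```python
def candidate_ligands(with_receptor, without_receptor, min_difference):
    """Return ligands whose response differs between the two populations.

    with_receptor:    dict mapping ligand name -> response of cells that
                      express the receptor
    without_receptor: dict mapping ligand name -> response of the otherwise
                      identical cells that lack the receptor
    min_difference:   smallest absolute difference treated as a hit
                      (an assumed, user-chosen threshold)
    """
    hits = []
    for ligand, response in with_receptor.items():
        baseline = without_receptor[ligand]
        if abs(response - baseline) >= min_difference:
            hits.append(ligand)
    return sorted(hits)
```

In practice, replicate measurements and a statistical test would normally replace the simple threshold used here.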

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or inflammation
resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.



4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
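
Once an amplified fragment has been sequenced, identifying a single-base polymorphism amounts to comparing the sample sequence against a reference position by position. The Python sketch below illustrates that comparison only; the function name is hypothetical, and it assumes the two sequences are already aligned and of equal length (insertions and deletions are not handled).

```python
def find_substitutions(reference, sample):
    """List single-base differences between two pre-aligned sequences.

    Returns (position, reference_base, sample_base) tuples, with positions
    counted from 1. Both sequences must have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    for index, (ref_base, sample_base) in enumerate(zip(reference, sample)):
        if ref_base != sample_base:
            differences.append((index + 1, ref_base, sample_base))
    return differences
```

An empty result would indicate that no substitution was detected in the compared fragment.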

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
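
The comparison of arthritis scores between treated and control animals on the scoring days named above could be tabulated as in the following Python sketch. Only the scoring days come from the text; the function names, the per-animal score lists and the use of a simple difference of means are assumptions made for illustration.

```python
# Scoring days after injection, as stated in the protocol above.
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def mean_score_by_day(scores):
    """scores maps each scoring day to a list of per-animal arthritis scores."""
    return {day: sum(scores[day]) / len(scores[day]) for day in SCORING_DAYS}


def score_reduction(control_scores, treated_scores):
    """Mean control score minus mean treated score for each scoring day.

    A positive value indicates a lower (improved) mean score in the
    treated group on that day.
    """
    control = mean_score_by_day(control_scores)
    treated = mean_score_by_day(treated_scores)
    return {day: control[day] - treated[day] for day in SCORING_DAYS}
```

Any conclusion about efficacy would still depend on group sizes and an appropriate statistical test, which this sketch does not attempt.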

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
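
The per-kilogram dose ranges stated above translate into absolute amounts by simple multiplication with body weight. The Python sketch below shows only that arithmetic; the constant and function names are hypothetical, and the actual dosage remains a matter for the prescribing physician.

```python
UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kilogram of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)      # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10.0 * UG_PER_MG)    # 0.1 ug/kg to 10 mg/kg


def dose_range_ug(body_weight_kg, range_ug_per_kg):
    """Return the (low, high) absolute dose in micrograms for a body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * body_weight_kg, high * body_weight_kg)
```

For a 50 kg patient, for instance, the preferred range works out to roughly 5 µg to 500 mg per dose.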

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti- 
inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternatively, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active 
compounds with a solid excipient, optionally grinding a resulting mixture, and processing the 
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee 
cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in a conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
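
The composition of the diluted VPD:5W co-solvent system described above can be worked out with simple arithmetic. The following is a minimal sketch, assuming that the 1:1 dilution is by volume and that volumes are additive; the function name `dilute` and the dictionary layout are hypothetical and are not part of the disclosure.

```python
# Illustrative arithmetic only: concentrations (% w/v) after mixing two
# solutions by volume, assuming the volumes are additive.

def dilute(stock, diluent, stock_parts=1.0, diluent_parts=1.0):
    """Return % w/v of every component after mixing stock and diluent."""
    total_parts = stock_parts + diluent_parts
    names = set(stock) | set(diluent)
    return {
        name: (stock.get(name, 0.0) * stock_parts
               + diluent.get(name, 0.0) * diluent_parts) / total_parts
        for name in names
    }

# VPD as described in the text (% w/v, made up to volume in absolute ethanol).
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# 5% dextrose in water.
D5W = {"dextrose": 5.0}

# VPD:5W is VPD diluted 1:1 with 5% dextrose in water.
vpd_5w = dilute(VPD, D5W)

for name in sorted(vpd_5w):
    print(f"{name}: {vpd_5w[name]:.2f}% w/v")
```

Under these assumptions each VPD component is present in VPD:5W at half its stock concentration, and dextrose at 2.5% w/v.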

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the 
concentration of the test compound which achieves a half-maximal inhibition of the protein's 
biological activity). Such information can be used to more accurately determine useful doses in 
humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
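
The therapeutic index defined above is a plain ratio, which the following minimal sketch makes explicit. The function name and the numeric values are hypothetical placeholders chosen only to illustrate the arithmetic; they are not measured data for any compound of the invention.

```python
# Illustrative arithmetic only: therapeutic index as the ratio LD50 / ED50.

def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Hypothetical placeholder values (mg/kg), not data from the disclosure.
example_ld50 = 500.0
example_ed50 = 5.0

index = therapeutic_index(example_ld50, example_ed50)
print(f"therapeutic index = {index:g}")
```

A larger ratio means a wider margin between the dose that is effective and the dose that is lethal in 50% of the population, which is why compounds with high therapeutic indices are preferred.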

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
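
As a rough illustration of how a dosing interval might be checked against the MEC, the sketch below estimates the fraction of one dosing interval during which plasma levels stay above the MEC. It assumes a simple one-compartment model with first-order elimination after a single bolus dose and ignores accumulation between doses; that model, the function name and all numeric values are assumptions made for illustration and do not come from this specification.

```python
import math


def fraction_above_mec(c0: float, k_elim: float, mec: float, tau: float) -> float:
    """Fraction of a dosing interval tau spent above the MEC.

    Assumes C(t) = c0 * exp(-k_elim * t), i.e. a single bolus dose with
    first-order elimination (an illustrative simplification).
    """
    if c0 <= mec:
        return 0.0
    # Time at which the concentration decays to the MEC.
    t_cross = math.log(c0 / mec) / k_elim
    return min(t_cross / tau, 1.0)


# Hypothetical values: peak 8 mg/L, MEC 2 mg/L, half-life 4 h, dosing every 12 h.
half_life_h = 4.0
k = math.log(2) / half_life_h
coverage = fraction_above_mec(c0=8.0, k_elim=k, mec=2.0, tau=12.0)

# Compare against the most preferred 50-90% window stated in the text.
in_most_preferred_window = 0.5 <= coverage <= 0.9
print(round(coverage, 3), in_most_preferred_window)
```

In practice, measured plasma concentrations (e.g., from HPLC assays or bioassays) would replace the modeled curve.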

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte-Doolittle
or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
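
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte-Doolittle hydropathy scale. The code below is an illustrative sketch only: the window length of 7 residues, the function names and the toy sequence are assumptions, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean Kyte-Doolittle score for every window of the given length."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(sequence: str, window: int = 7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    sequence = sequence.upper()
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Toy sequence (hypothetical): a lysine-rich stretch flanked by alanines.
toy_sequence = "AAAAAAAKKKKKKKAAAAAAA"
start, peptide, score = most_hydrophilic_window(toy_sequence)
print(start, peptide, round(score, 2))
```

Windows with strongly negative mean scores are candidates for surface-exposed regions; any such prediction would still need experimental confirmation.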

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
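
For orientation, the classical linear Scatchard transformation plots the ratio of bound to free ligand (B/F) against bound ligand (B); for a single class of binding sites, B/F = (Bmax - B)/Kd, so the slope is -1/Kd and the x-intercept is Bmax. The sketch below fits that line by ordinary least squares. It is a simplified illustration and is not asserted to be the specific procedure of the cited reference; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound/free concentrations.

    Fits B/F = (Bmax - B) / Kd by least squares, assuming one class of
    independent binding sites (an illustrative simplification).
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope of the Scatchard line is -1/Kd
    bmax = -intercept / slope    # x-intercept of the Scatchard line is Bmax
    return kd, bmax


# Synthetic, noise-free data generated from hypothetical Kd = 2.0 and Bmax = 10.0
# (arbitrary concentration units), using B = Bmax * F / (Kd + F).
true_kd, true_bmax = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [true_bmax * f / (true_kd + f) for f in free_conc]

kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
print(round(kd_est, 3), round(bmax_est, 3))
```

A lower Kd corresponds to higher binding affinity. With real, noisy measurements a nonlinear fit to the untransformed binding curve is often more reliable than the linearized plot.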

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 



5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).



5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
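
The figure of ten different antibody molecules follows from simple counting: two heavy chains and two light chains give four possible heavy/light arms, and an antibody consists of an unordered pair of arms, giving ten combinations. The sketch below enumerates them; the chain labels H1, H2, L1 and L2 are hypothetical placeholders, and the enumeration assumes fully random chain assortment.

```python
from itertools import combinations_with_replacement, product

# Hypothetical labels for the two heavy and two light chains in a quadroma.
heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each antibody arm pairs one heavy chain with one light chain: 4 possible arms.
arms = list(product(heavy_chains, light_chains))

# An antibody molecule is an unordered pair of arms (the two arms may be identical).
molecules = list(combinations_with_replacement(arms, 2))

# The desired bispecific molecule has both cognate pairings, H1/L1 and H2/L2.
desired = {("H1", "L1"), ("H2", "L2")}
correct = [m for m in molecules if set(m) == desired]

print(len(arms), len(molecules), len(correct))
```

Only one of the ten enumerated species carries both correct heavy/light pairings, which is why affinity purification steps are needed to recover it.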

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolaca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, sapaonaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
2,2 Bi, ,3I I ) ,3, In, 90 Y 3 and 186 Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
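
By way of illustration only, recording sequence information as an ASCII text file and reading it back might be sketched as follows. The FASTA-style layout, the function names, the file name and the record shown are assumptions made for this example; the record is a hypothetical placeholder and not a sequence of the invention.

```python
import os
import tempfile


def write_fasta(path, records, width=60):
    """Write (identifier, sequence) pairs to a plain ASCII FASTA-style text file."""
    with open(path, "w", encoding="ascii") as handle:
        for identifier, sequence in records:
            handle.write(f">{identifier}\n")
            # Wrap the sequence so each line holds at most `width` residues.
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read the file back into a dict mapping identifier -> sequence."""
    records = {}
    current = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = line[1:]
                records[current] = ""
            elif current is not None:
                records[current] += line
    return records


# Hypothetical placeholder record; not an actual sequence of the invention.
example_records = [("example_seq_1", "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")]


def round_trip(records):
    """Record the sequences on a medium (a temporary file) and read them back."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sequences.fasta")
        write_fasta(path, records)
        return read_fasta(path)
```

The same records could equally be loaded into a database table; a flat text file is used here only because it is the simplest of the formats named above.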

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358, in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
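
As a minimal sketch of what identifying an ORF in a stored sequence can involve, the following scans all six reading frames for an ATG start codon followed in frame by a stop codon. It is a plain codon scan written for illustration, not the BLAST or BLAZE software referred to above, and the function names and the `min_codons` threshold are assumptions of this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) tuples for simple ATG...stop ORFs.

    start and end are 0-based coordinates on the scanned strand, and the
    returned ORF includes its stop codon. min_codons counts the codons
    that precede the stop codon.
    """
    seq = seq.upper()
    results = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(template) - 2, 3):
                codon = template[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        results.append(
                            (strand, frame, start, pos + 3, template[start:pos + 3])
                        )
                    start = None
    return results
```

For the hypothetical input `CCATGGCCATTGTATAGGC`, the scan reports a single forward-strand ORF, `ATGGCCATTGTATAG`, beginning at position 2.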

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software packages for conducting search means can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
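
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, a database of independent, equally frequent residues (four bases, or twenty amino acids via `alphabet_size`); real databases depart from this model, so the figures indicate scale only. The function name and parameters are hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in random sequence.

    Assumes independent, uniformly distributed residues. Each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)
```

Under this model a six-nucleotide target is expected about once in every 4096 positions, whereas a 30-nucleotide target is expected fewer than once in 10^18 positions, which illustrates why longer targets are unlikely to be present as random occurrences.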

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).


4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
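
To illustrate how sequence information feeds into antisense design, the following sketch derives a DNA oligonucleotide complementary to a chosen stretch of an mRNA, written 5' to 3'. It covers only the complementarity step; the function name, the length check against the preferred 20 to 40 base range, and the example mRNA are assumptions of this example, and no claim is made about the efficacy of any particular oligonucleotide.

```python
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo complementary to mrna[start:start+length], written 5'->3'."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    # Accept either RNA (U) or cDNA-style (T) input by normalizing to RNA.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(RNA_TO_DNA_COMPLEMENT)[::-1]
```

For the hypothetical mRNA `AUGGCCAUUGUAAUGGGCCGCUGA`, a 20-base oligonucleotide against the first 20 residues is `CGGCCCATTACAATGGCCAT`.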


4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
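
A degenerate pool can be pictured as the set of discrete sequences obtained by expanding each ambiguous position of a probe. The sketch below performs that expansion using the standard IUPAC nucleotide ambiguity codes; the function name and the example probe are hypothetical and are offered only to illustrate the idea.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(bases) for bases in product(*choices)]
```

For example, the hypothetical probe `ATGRAY` (R = A or G, Y = C or T) expands to the four sequences ATGAAC, ATGAAT, ATGGAC and ATGGAT. The pool size is the product of the number of choices at each position, so it grows quickly as ambiguous positions are added.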

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.
More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ssDNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
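
The quantities stated in the linkage method imply a few per-well figures, which can be worked out as below. This is arithmetic on the stated values only. It assumes, for illustration, that the DNA concentration of 7.5 ng/ul still applies after the 1-methylimidazole is added (so the DNA mass is an upper estimate) and that volumes are additive; the function and key names are hypothetical.

```python
def coupling_well_summary(dna_ng_per_ul=7.5, dna_volume_ul=75.0,
                          edc_stock_molar=0.2, edc_volume_ul=25.0):
    """Per-well quantities implied by the stated volumes and concentrations."""
    total_volume_ul = dna_volume_ul + edc_volume_ul
    return {
        # Upper estimate: ignores dilution by the added 1-methylimidazole.
        "dna_ng_per_well": dna_ng_per_ul * dna_volume_ul,
        "total_volume_ul": total_volume_ul,
        # EDC stock is diluted by the DNA solution already in the well.
        "edc_final_molar": edc_stock_molar * edc_volume_ul / total_volume_ul,
    }
```

With the stated values this gives at most about 562.5 ng of DNA per well in a 100 ul reaction, with EDC at a final concentration of about 0.05 M. Because both solutions contain 10 mM 1-methylimidazole, that concentration is unchanged by mixing.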

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
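
One way to see how a combinatorial strategy reaches a matrix of this size is that 256 equals 4^4, the number of sequences obtained when four positions are each varied over the four bases. The sketch below enumerates such a set and lays it out as a 16 x 16 grid. That the cited array was composed in exactly this way is an assumption made for illustration, and the function names are hypothetical.

```python
from itertools import product


def combinatorial_probes(variable_positions=4, bases="ACGT"):
    """Enumerate every sequence over `bases` at the given number of positions."""
    return ["".join(p) for p in product(bases, repeat=variable_positions)]


def as_matrix(probes, columns=16):
    """Arrange a flat probe list into rows, one row per `columns` probes."""
    return [probes[i:i + columns] for i in range(0, len(probes), columns)]
```

Calling `as_matrix(combinatorial_probes())` yields 16 rows of 16 probes, so each probe has a defined row and column address in the grid.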

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
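
The difference between the normal and the relaxed specificity can be illustrated in silico. The following is a minimal sketch and not the procedure of Fitzgerald et al.: it assumes only the site definitions given above (Pu = A or G, Py = C or T, cleavage between the G and the C), it scans a single strand, and the names `cut_positions`, `fragments` and the toy sequence are hypothetical.

```python
import re

# Site definitions taken from the text: Pu = A or G, Py = C or T.
NORMAL = "[AG]GC[CT]"                              # PuGCPy
RELAXED = "(?:[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])"   # PuGCPy, PyGCPy, PuGCPu


def cut_positions(seq, pattern):
    """Return the 0-based index of the C that follows each cut (between G and C)."""
    seq = seq.upper()
    # A lookahead reports every site, including overlapping ones.
    return [m.start() + 2 for m in re.finditer(f"(?={pattern})", seq)]


def fragments(seq, pattern):
    """Split one strand at every cut position and return the pieces in order."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


toy = "AAGCTTTGCCAGGCATT"  # made-up sequence, not taken from pUC19
print(cut_positions(toy, NORMAL))    # sites under normal specificity
print(cut_positions(toy, RELAXED))   # sites under relaxed (CviJI**) specificity
print(fragments(toy, RELAXED))
```

On the toy sequence the relaxed pattern finds more sites than the normal one, which is the property that makes the fragment distribution approach randomness.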

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
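
The geometry of the 64-patient example can be checked with a little coordinate arithmetic. The sketch below is illustrative only. It assumes that the 96-pin device follows the usual 8 x 12 microtiter layout with a 9 mm pin pitch, and that the 64 dots of each subarray are offset-printed as an 8 x 8 grid with a 1 mm pitch; neither detail is stated in the text, and all names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # assumed 96-pin layout (8 x 12)
PIN_PITCH_MM = 9.0           # assumed pin spacing of a 96-well format
SUB_SIDE = 8                 # assumed 8 x 8 = 64 dots per subarray
DOT_PITCH_MM = 1.0           # one dot per 1 mm x 1 mm cell


def dot_position(patient, pin):
    """Return (x_mm, y_mm) of the dot printed for `patient` (0-63) by `pin` (0-95)."""
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index out of range")
    if not 0 <= pin < PIN_ROWS * PIN_COLS:
        raise ValueError("pin index out of range")
    pin_row, pin_col = divmod(pin, PIN_COLS)       # which subarray
    off_row, off_col = divmod(patient, SUB_SIDE)   # offset inside the subarray
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def footprint_mm():
    """Return (width, height) of the area covered by all 96 subarrays."""
    width = (PIN_COLS - 1) * PIN_PITCH_MM + SUB_SIDE * DOT_PITCH_MM
    height = (PIN_ROWS - 1) * PIN_PITCH_MM + SUB_SIDE * DOT_PITCH_MM
    return width, height


def subarray_gap_mm():
    """Return the free space left between neighbouring subarrays."""
    return PIN_PITCH_MM - SUB_SIDE * DOT_PITCH_MM


print(footprint_mm(), subarray_gap_mm())
```

Under these assumptions each plate is printed once with all 96 pins, shifted by the patient's offset, so the 64 plates yield 96 identical subarrays that fit within an 8 x 12 cm membrane with a 1 mm gap between them.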

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
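
The idea of grouping clones by their probe signatures can be sketched as follows. This is a toy illustration, not the SBH analysis actually used: it idealizes hybridization as an exact substring match, and the Jaccard similarity measure, the 0.8 threshold, the greedy single-pass grouping and all names are assumptions introduced for the example.

```python
def signature(seq, probes):
    """Idealized signature: the set of probes that occur exactly in the clone."""
    seq = seq.upper()
    return frozenset(p for p in probes if p in seq)


def jaccard(a, b):
    """Similarity of two signatures: shared probes over all probes seen."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(signatures, threshold=0.8):
    """Greedy grouping: a clone joins the first cluster whose representative
    signature is at least `threshold` similar, otherwise it starts a new one."""
    clusters = []  # list of (representative signature, member names)
    for name, sig in signatures.items():
        for rep, members in clusters:
            if jaccard(sig, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]


# Made-up 7-mer probes and clone inserts.
probes = ["ACGTACG", "TTTTAAA", "GGGCCCA", "CATCATC"]
clones = {
    "c1": "GGACGTACGCCTTTTAAAGG",
    "c2": "ACGTACGATTTTAAAG",
    "c3": "GGGCCCATCATCATC",
}
sigs = {name: signature(seq, probes) for name, seq in clones.items()}
print(cluster(sigs))
```

One representative per group would then be carried forward for sequencing, which is what keeps the sequencing effort proportional to the number of distinct inserts rather than the number of clones.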

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
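
The extension procedure described above can be outlined in code. The sketch below is an assumption-laden outline rather than the program that was used: the search itself is abstracted behind a hypothetical callable `find_hits`, which stands in for a BLASTN search of the current assemblage against the databases, and the recursion is written as an equivalent loop. Only the two thresholds are taken from the text.

```python
from dataclasses import dataclass

MIN_SCORE = 300       # BLAST score threshold stated in the text
MIN_IDENTITY = 95.0   # percent identity threshold stated in the text


@dataclass(frozen=True)
class Hit:
    seq_id: str
    score: float
    identity: float  # percent identity


def extend_assemblage(seed_id, find_hits):
    """Grow an assemblage from a seed until no qualifying new sequence is found.

    `find_hits(member_ids)` is a hypothetical stand-in for searching the
    current assemblage against the databases; it returns an iterable of Hit.
    """
    members = {seed_id}
    while True:
        new = {
            h.seq_id
            for h in find_hits(frozenset(members))
            if h.score > MIN_SCORE and h.identity > MIN_IDENTITY
        } - members
        if not new:          # nothing extends the assemblage: terminate
            return members
        members |= new


# Toy database: which sequences are hit by which member (made-up values).
links = {
    "seed": [Hit("a", 350, 98.0), Hit("b", 250, 99.0)],
    "a": [Hit("c", 400, 96.0), Hit("d", 500, 90.0)],
    "c": [Hit("e", 300, 99.0)],
}


def toy_find_hits(member_ids):
    return [h for m in member_ids for h in links.get(m, [])]


print(sorted(extend_assemblage("seed", toy_find_hits)))
```

In the toy run, sequences that miss either threshold never enter the assemblage, so anything reachable only through them is never pulled in either.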

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327.
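
A simple check of the kind that can flag candidate positions for hand editing is to look for stop codons inside the reading frame. The sketch below is only an illustration and is not one of the programs named above: it uses the three stop codons of the standard genetic code, and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def internal_stops(cdna, frame=0):
    """Return 0-based nucleotide positions of in-frame stop codons that occur
    before the last complete codon of the given reading frame."""
    cdna = cdna.upper()
    codon_starts = range(frame, len(cdna) - 2, 3)
    if len(codon_starts) == 0:
        return []
    last = codon_starts[-1]  # a stop here would be the normal terminator
    return [
        i for i in codon_starts
        if i != last and cdna[i:i + 3] in STOP_CODONS
    ]


print(internal_stops("ATGAAATAGCCCTGA"))   # premature TAG inside the frame
print(internal_stops("ATGAAACCCTGA"))      # only the terminal stop
```

A premature stop reported this way may reflect either a base-call error or a frame shift upstream, which is why the text describes the correction as a manual step checked against database matches.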

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences for which the nucleic acid sequence encodes are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
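
The two summary values reported in Table 5 can be illustrated with a small calculation. The sketch below does not run SignalP. It assumes that a per-residue S score is already available for each position and that the mean S score is the average of those scores over the predicted signal peptide, which is how the definition in the cited reference is understood here; the scores and the function name are made up.

```python
def s_score_summary(s_scores, signal_length):
    """Return (maximum S score, mean S score over the predicted signal peptide).

    `s_scores` holds one score per residue, starting at the N-terminus;
    `signal_length` is the number of residues before the predicted cleavage site.
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the scored region")
    max_s = max(s_scores)
    mean_s = sum(s_scores[:signal_length]) / signal_length
    return max_s, mean_s


# Made-up scores for a six-residue stretch with a four-residue signal peptide.
max_s, mean_s = s_score_summary([0.9, 0.8, 1.0, 0.5, 0.2, 0.1], 4)
print(max_s, round(mean_s, 3))
```

A high maximum with a low mean would indicate a short high-scoring stretch rather than a sustained signal-peptide-like region, which is why both values are tabulated.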

5.3.2 EXAMPLE 4 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences for which the nucleic acid sequence encodes are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 5
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences for which the nucleic acid sequence encodes are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.
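
Table 1 lists SEQ ID NOS in a compact notation in which single numbers and hyphenated ranges are separated by spaces, and a range may be wrapped across two lines after its hyphen. For readers who want to work with such lists programmatically, the following minimal sketch expands the notation into individual numbers. The function name is hypothetical and the sample string is made up for illustration.

```python
import re


def expand_seq_ids(text):
    """Expand a list such as '9 19-21 50-51' into [9, 19, 20, 21, 50, 51]."""
    # Rejoin ranges that were wrapped across lines, e.g. '356-' / '357'.
    text = re.sub(r"-\s+", "-", text)
    ids = []
    for token in text.split():
        first, _, last = token.partition("-")
        lo = int(first)
        hi = int(last) if last else lo
        if hi < lo:
            raise ValueError(f"descending range: {token}")
        ids.extend(range(lo, hi + 1))
    return ids


print(expand_seq_ids("9 19-21 356-\n357 362"))
```

Rejecting descending ranges is a deliberate choice: in a scanned table such a token usually signals a misread digit rather than real data.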

The homology for SEQ ID NO: 1653-1745 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 118, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homology for SEQ ID NO: 1746-1768 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 119, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences for which the nucleic acid
sequence encodes are shown in the Sequence Listing. The full-length nucleotide sequences,
including splice variants, resulting from these procedures are shown in the Sequence Listing as
SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homology for SEQ ID NO: 1769-1786 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin   RNA Source   Hyseq Library Name   SEQ ID NOS:

adult brain     GIBCO        AB3001               9 19-21 50-51 65-66 72 78 80 82
                                                  85 87 107-108 113 116 123 138
                                                  140 150-152 159 169 177 192-193
                                                  202-203 212-214 225-226 235-236
                                                  251 258 268-269 272 280-281 295
                                                  298 301 321 326 331-332 334 356-
                                                  357 362 369 379 382-383 416 423
                                                  443 459-460 473 475 477 488 496
                                                  500 503 519 526 547 574 582 587
                                                  608-609 613 618 633-634 645-646
                                                  652 657-658 660 669-671 678 687
                                                  695 697 710 715 724 731 775-777
                                                  796 804 811 857-859 862 869 899-
                                                  900 912 919 922 924-929 933 936
                                                  962 979 988-989 996 1001 1004-
                                                  1008 1018 1039 1047 1059 1064
                                                  1067 1070 1078 1082 1107 1113
                                                  1116-1117 1131 1134-1137 1140
                                                  1149 1151 1157 1180 1206 1229
                                                  1234 1241 1243 1258 1272-1273
                                                  1279 1288-1290 1294 1307-1308
                                                  1312 1320 1323 1330 1356 1360-
                                                  1361 1368 1373-1375 1379 1391
                                                  1400 1417 1446 1468 1482 1493-
                                                  1494 1501-1503 1506-1507 1512
                                                  1517 1522-1524 1530-1533 1537
                                                  1549 1565 1578 1598 1606 1608
                                                  1623 1625 1627 1639 1643 1648-
                                                  1649 1653 1664 1667 1671 1696
                                                  1734 1741 1743-1744 1760-1761
                                                  1771

                GIBCO        ABD003               3 12-14 18-19 25 30-31 34-36 43-
                                                  45 50-51 56 58 60 65-66 68-69 80
                                                  82 85 87 92 104 107-108 112-113
                                                  115-116 123-124 131-132 135-137
                                                  139 142 146 148-149 152 154 157
                                                  159 163 165 167 169 172 180 192-
                                                  193 196-197 199 203 208 210 212-
                                                  214 223 233 235-237 247 257 259
                                                  261 268-269 272 276 280-281 284-
                                                  288 291-292 295 297 300-301 304
                                                  307 317 320-321 323 327 329-331
                                                  333-334 345-349 356-357 379-381
                                                  393 401 408 414 419 424 426-428
                                                  430 433-436 438-439 443 445 449
                                                  453-454 459-461 468 471-473 476-
                                                  478 483 491 494 496 500 503 507-
                                                  508 516 519-520 525-527 534 536-
                                                  540 542-543 545 553 555 560 569-
                                                  570 574-576 586-588 593 595 597
                                                  601 606-609 615-620 622-623 625
                                                  628-633 635-636 643 645-649 653
                                                  655-656 660-665 668-670 676 681
                                                  687 701 710 715 717 724-728 735
                                                  743 745-746 750 753 759 765-766
                                                  773 775-778 786 789 796 799-800
                                                  802-803 810-811 815 817 820-821
                                                  832 834-836 840 845-847 851 858-
                                                  861 864 869 874 878 883 897 901-
                                                  902 904-905 908 911-914 916 921-
                                                  922 924-927 929 932-934 936-939
                                                  941-942 945 955-958 963 966-969
                                                  977 979-980 985-986 990 992-993
                                                  997-1001 1005-1007 1012 1017-
                                                  1020 1023-1024 1029-1031 1034
                                                  1036 1039 1050 1059 1063-1066
                                                  1078 1081-1082 1085-1086 1089
                                                  1097 1103 1107 1109 1112 1116-
                                                  1117 1119 1121 1124 1127 1130
                                                  1134 1144 1145 1149 1151 1157-
                                                  1158 1167 1170 1178 1184 1188
                                                  1190 1193-1194 1200 1202 1215-
                                                  1217 1220 1226-1227 1229 1231
                                                  1241 1243 1247 1252 1258 1263
                                                  1267 1269 1279 1281 1284 1286-
                                                  1289 1293-1294 1306 1307 1312
                                                  1316-1320 1326 1333 1338 1341
                                                  1344 1348 1351 1355-1357 1368
                                                  1374 1377 1380 1386 1389-1390
                                                  1394 1400 1409 1414 1422-1423
                                                  1425 1427 1437 1443 1446 1454
                                                  1456 1458-1459 1468 1470-1472
                                                  1478 1482-1483 1487-1488 1493
                                                  1497 1499 1506 1508 1511 1517
                                                  1522-1524 1530 1533 1545-1546
                                                  1548-1550 1552 1557-1559 1563
                                                  1565 1567 1569 1571 1586 1588
                                                  1591 1593 1595 1598-1601 1608
                                                  1611 1620 1621 1624-1626 1628
                                                  1630 1632 1636 1640-1641 1644-
                                                  1645 1647 1649 1653 1655 1657
                                                  1664 1667 1669 1673 1678-1681
                                                  1686 1690 1694-1696 1701 1709
                                                  1711 1719 1722 1723 1726-1727
                                                  1731 1733 1738 1740 1743 1744
                                                  1747 1749 1753 1757-1758 1760-
                                                  1761 1765 1771 1785

adult brain     Clontech     ABR001               9 29 68-69 113 115 146 152 206
                                                  223 245 277 307 320 324 330-331
                                                  344 348 352 362 379 384 393 404
                                                  408 414 441-442 454 469 481 490
                                                  506 517 586 597 631 641 659 691
                                                  715 799 803 833 865 871 875 880
                                                  882 908 920 937 1000 1005-1006
                                                  1027 1036 1041 1043 1075 1107
                                                  1112 1121 1127 1136-1137 1144-
                                                  1147 1231 1238-1239 1280 1293
                                                  1320 1345 1355 1361 1383-1384
                                                  1400 1417 1448 1456 1476 1507
                                                  1570 1572 1609-1610 1614 1620
                                                  1626 1645 1653 1754 1759 1770
                                                  1786

adult brain     Clontech     ABR006               5-8 15-16 168 212-213 271 278
                                                  280-281 291-292 300-301 310 314
                                                  321 326 336-338 341 352 357 359-
                                                  360 362 369 374 379 384 393 396-
                                                  397 414 419-420 426-428 430 441-
                                                  442 453 506 616-617 661 689 785
                                                  798 845 1018 1109 1113 1124 1148
                                                  1167 1187 1207 1227 1252 1265
                                                  1285 1312 1317-1319 1324-1327
                                                  1344 1369 1381 1400 1416 1421
                                                  1427 1430-1431 1436 1471 1501
                                                  1557-1559 1586 1588 1651 1653
                                                  1654-1655 1671 1673 1690 1697-
                                                  1698 1700 1711 1717 1719-1720
                                                  1728 1736 1740 1743-1744 1757
                                                  1760-1761

adult brain     Clontech     ABR008               5-10 13-19 22-23 25 29 33 37-39
                                                  43-45 50-51 54-55 57-58 60-66
                                                  68-70 72 75 77-80 83 85 89-92 94
                                                  99-105 108-110 112-113 116-117
                                                  123 128 133 135-137 139 143 145-
                                                  146 148 152 154-155 157 166 168-
                                                  172 174-175 181-184 188-190 193-
                                                  194 196 198-200 202 204-205 207-
208 210 214-215 218 221-226 229 
231-232 234-241 245-247 251-253 
255 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 453-455 462 464 
467 469-471 476 478 482-484 488- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 72S-734 736-739 
742-743 746 750-752 756 758-759 
762-764 766 768 773-778 780-782 
784-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 842-843 856-859 
861-862 865 867-872 874-875 881 
883-884 887 889-892 894-895 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 141S-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 1689-1690 
1694-1696 1704-1705 1708-1709 
1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
10S9 1204 1609 1731-1732 



adult brain 



Clontech 



ABR011 



adult brain 



BioChain 



ABR012 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



adult brain 



Invitrogen 



ABR013 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



adult brain 



Invitrogen 



ABR014 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



adult brain 



Invitrogen 



ABR015 



419 434-435 441-442 763 789 983 
1320 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



adult brain 



Invitrogen 



ABR016 



14-16 22-23 25 37-39 43 58 60 
70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 198 210 218 229 259 267 
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 770 780-781 789 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960- 
961 963 968-969 988-989 1000 
1005-1006 1016-1019 1021 1036- 
1037 1052 1096 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262 
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547 
1554 1557-1559 1561-1562 1567 
1585 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



adult brain 



Invitrogen 



cultured 
preadipocytes 



Stratagene 



ADP001 



5-8 11 17 25 68-69 80 82 87 103 
105 110 116 136-139 168 171 188- 
189 196-198 261 267 276 288 293 
301 318 331 336 338 379-380 391 
400 428 430-431 510-512 520 524 
527 549 557 561 602 618 620 622 
631 637 647 670 681-682 710 731 
748 782 793-794 817 834-836 843 
845 858-859 879 882 893-895 934 
960 982 986 995-996 1000 1002 
1005-1007 1025 1027-1028 1032 
1039 1045 1071 1078 1097 1099- 
1102 1136-1137 1140 1219-1220 
adrenal gland 



adult heart 



Clontech 



ADR002 



1260 1271 1297-1298 1314 1320 
1322 1329 1339 1345 1365-1366 
1370-1371 1398 1408 1423 1431 
1437 1466 1468 1533 1539 1594 
1602 1608 1614 1631 1649-1650 
1660 1662 1673 1687-1688 1696 
1711 1719-1720 1742 1746 1749 
1760-1761 1765 1767 1771 1785 



4-10 15-16 25 29-31 43-45 47 50- 
51 5S 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169- 
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280- 
281 285 295 298 311 336-338 342 
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923 
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175 
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588 
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
176S 



GIBCO 



AHR001 



4-8 10-11 15-16 18-21 34-39 44- 
46 50-52 57-58 60 62-63 71 75 82 
85 87 89 94 97 100 103-104 108- 
110 112 114 116 118-119 122-123 
127 130-132 134 136-138 141-144 
147-151 153 163-164 168 171 179 
186 192 195 197 199 204-205 212- 
215 220 225-226 229-230 232 234- 
236 251 257-260 262 265 272 274 
277 280-282 285 286 289-292 296 
298-301 304 307 309 314 321 324- 
325 330 333 336-338 345 349 351- 
352 354 358 361 368 370 380 383- 
384 387-388 391 393 397 401 406 
408-409 411-412 414-416 430-431 
433-439 445-446 449 452 454-455 
457 459 462 469 472-473 476-480 
483-484 487-490 492-493 496-498 
503 506 508 510 513 516 519-522 
526 534 536-540 542 546 549 553 
560-562 574-577 581-582 584 586- 
587 589 593 595 597 604 609 611- 
612 615-620 622 623 626 632 637 
645-652 656-660 665-666 670-672 
674-675 683-684 687 692-694 697 
701 709 712 715 716 719-720 725- 



726 728 730-732 735 738-739 743- 
744 746 751 753 759 761 765 770- 
771 775-780 785 788-790 796 802 
804 810 812 817 821 826 828 830 
837 843 845-847 849-853 857-861 
863-864 869 871 875 877-879 881 
883 887 890-892 894-895 897-898 
901 903 906-907 911-913 915 919 
921-925 927-928 933-935 945 958 
961-963 967 969-972 975 977-978 
980-986 990 992 999-1002 1UU3- 
1007 1010 1016 1019-1020 
1023 1025 1028-1037 1039-1040 
1043 1047 1050 1054-1055 1057 
1059 1063-1064 1067-1068 1070 
1072 1075-1076 1083 1085-1087 
1089 1093-1094 1104 1106 1108- 
1109 1113 1116-1117 1119 1121 
1124 1126 1128 1131-1134 1144- 
1145 1148-1149 1151 1158 1167 
1169-1170 1175 1177 1192 1196 
1199-1200 1202 1206-1208 1211 
1216 1218 1222 1227-1229 1232- 
1235 1238-1241 1243-1244 1247- 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1349-1350 1354-1356 1359-1360 
1365-1366 1369 1371 1374-1375 
1378-1380 1383-1384 1389 1397 
1400 1403 1409 1417 1423-1426 
1437 1439 1442 1444 1446-1447 
1450 1453 1468 1470 1473 1479 
1481 1488 1490 1501-1504 1519 
1521 1524 1528 1530-1534 1536- 
1537 1539 1541-1542 1547 1553 
1555 1560 1565 1567-1571 1588 
1591 1597-1598 1601-1602 IbUb 
1614-1616 1619-1620 1623 
1630-1632 1634 1636 1641 1644- 
1645 1647 1649 1652-1655 1659 
1662 1667 1673-1674 1680-1681 
1684 1686-1688 1704-1705 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-175S 1760-1761 
1765 1772 1785 








adult kidney 


GIBCO 


AKD001 


4-8 10-11 17-21 29-31 35-39 42- 
45 50-51 56-58 60-61 64 68-69 75 
77 80 82 85 87 92-94 97 100 102- 
104 107-108 112 116-117 119 123 
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169 
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 284-286 
293 293 295 298-299 301-302 304 
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358- 
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400- 
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471- 
474 476-477 480-481 483 487-488 
490-491 493 497-505 510-513 516- 
520 522 524 526-529 534 537-540 
544 547 549 554-556 560 562 564 
567 571-576 578 582 586-589 592- 
593 598-599 601 604-606 608- 61 ~\ 
615-619 621-626 632-634 637-643 
645-652 655 660-664 669-672 676 
678-679 698 692-695 698 702 711 
713 717 719-720 727 731 735-736 
738 743 745-745 751 753 755 762- 
763 765 771-773 775-778 780 786 
788 793 795-799 800 803 805 808 
810-812 814-819 821 826 829 832 
834-838 842-845 848-855 857-861 
864-865 867 869 871 874 o / o o o o 
886-887 889-891 893-896 O jJ O — J7U u 
902 906-908 910-914 918 920 922 
925-927 929-935 937 940- 
94 8- 949 951 953-958 960- 
964 969-970 972 976-978 70^: - 700 
988-990 992-993 995-997 
-1008 1010 1012-1013 1016- 
1017 1019-1020 1022 1025-1031 
1035 1038-1040 1042 1044 1047 
1054-1055 1057-1064 
1070-1073 1078 1085-1086 1088- 
1089 1092 1094 1097 1099-1102 
1 1 m X J.U / 1109-1112 1116-1119 1121 
-1125 1132-1135 1140 1142- 
1143 1146-1147 1149-1150 1153- 
1154 1157 1159 1163 1167 1170 
1178-1179 1181 1183 1192 1196- 
i inn 1202-1204 1206-1211 1216- 
1 01 o 1221-1222 1225 1227-1230 
■LA -1234 1238-1241 1243-1244 
-1247 1253 1257-1258 1260- 
1267-1268 1270 1272-1274 
1281 1283 1287-1289 1293-1295 
1299 1306 1308 1311-1313 1317- 
1320 1323 1329-1330 1334 -1 -a -a c 
1339 1341 1349-1350 1353 
1359 1367 1369 1373 1375 J. J to — 
1379 1394 1397 1400 1403 
1407-1409 1417 1419 1423-1424 
1428-1431 1433 1437-1438 1442- 
1443 1445-1446 1448-1450 1 >fl c ^ _ 
1454 1459 1461 1465-1468 1474- 
1475 1478 1484-1488 1490 1492- 
1493 1495 1497-1498 1506-1507 
1509 1512 1518 1521-1522 1525 
1527-1528 1532-1533 1537 1540- 
1541 1547-1550 1552 1556-1559 
1561 1565-1566 1568 1571 1575 
1578-1579 1583 1586-1587 1589 
1591-1592 1594 1598 1600 1603- 
1604 1606 1608 1611 1613 1615- 
1616 1618-1622 1624-1628 1631- 
1632 1634-1636 1638-1639 1641 
1644 1646-1649 1653-1656 1662 
1664 1666-1667 1670-1671 1676- 
1679 1683-1684 1686 1691-1692 
1696-1699 1701 1709-1711 1713- 
1714 1716-1719 1723-1724 1726- 
1727 1733 1737-1738 1741 1743- 
1744 1748-1749 1751 1760-1761 
1763-1768 1776 1780 1785 




adult kidney 


Invitrogen 


AKT002 


20-21 37-39 47 52 57 60 65-66 
68-69 80 104 107-108 122 130 133 
136-137 140 142-143 149 169 174 
181 197 227-228 235-236 244 251 
261-265 267 280-281 286 290 299 
301 304-305 309 312-313 339 341 
344-345 349 358 370-372 376 382- 
383 387 392 401 414 416 421 430 
443 445 449 453-454 472 487-488 
504 506 513 516 519 522 528 536- 
540 546 554 585 587 594 598 602 
607 616-617 626-627 636 643 662- 
664 695 709 721 735 743 761 768 
775-777 788 796 804 814 827 837- 
838 849-850 852-853 869-870 881 
890-892 898 903 905-907 914 919 
925 927 934 941 949 952 957 960 
962 968 970 1000 1008 1029-1030 
1044 1052 1055 1063 1067-1068 
1073 1085 1099-1102 1107 1110- 
1111 1113 1115 1119 1126 1134 
1136-1137 1146-1148 1153 1159 
1192 1196 1199 1232-1233 1241 
1256 1264 1272-1273 1281 1285 
1293-1294 1299 1312 1320 1324- 
1325 1330 1344 1349 1351 1355- 
1356 1369 1378-1379 1403 1414 
1419 1428-1429 1436 1446 1458 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527 
1529 1534 1547 1596 1600 1619 
1623 1629 1631 1634 1638 1643 
1647 1652 1660 1664 1667 1669- 
1670 1673 1686 1709 1727 1740 
1776 










adult lung 


GIBCO 


ALG001 


4-8 14 37-39 44-46 50-51 56 62- 
63 75 82 88 93 103-104 113 125 
133 140 143 150 152 154 157 162 
171-172 174-175 190-191 196 200 
211 214 219 223-224 227-228 251- 
252 256 265 272 274 280-281 285 
310 332 345 351 362 371 381-382 
394 408-409 431 436 445 454 459 
461 467 469 471 476-477 488 504 
513 527 537-540 544 547-548 554 
564 583 607 616-617 621 623-624 
634 645-646 662-664 670 695 716 
719 743-744 763 766 774 789 803 
811 814 817 831-832 837-838 845 
852-853 858-859 861 866 880 887 
901 905 941 954-957 966 971 977 
979 981 987 990 992 996 1001 
1005-1006 1014 1017 1045 1047 
1054 1059 1062 1064 1072 1080 
1086-1089 1094 1107 1126 1134 
1136-1137 1142 1150 1157 1173 
1190 1200 1208 1220 1241 1272- 
1273 1280 1282 1295 1306 1320 
1331-1332 1353 1374 1379 1383- 
1384 1404 1409 1423 1434 1436 
1442 1474 1478 1494 1509 1522 
1525 1531-1532 1547 1549 1553- 
1554 1571 1598 1606 1613 1624 
1627-1629 1632 1642 1644 1662 
1669 1676-1677 1684 1696 1727 
1731-1732 1737-1738 1748-1749 
1786 








lymph node 


Clontech 


ALN001 


4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547 
young liver 



GIBCO 



ALV001 



adult liver 



Invitrogen 



ALV002 



621 626 649 679 719 725-726 738 
793 803 831 834-836 838 844 857- 
858 866 879 905 913 928 963 976 
1005-1006 1012 1038 1050 1116- 
1117 11S1 1199 1204 1226 1243 
1265 1274 1324-1325 1339 1353 
1374 1377 1440-1441 1447 1504 
1549 1600 1618-1619 1631 1641 
1644 1653 1687-1688 1691-1692 
1741 1771 



5-8 11 20-21 46 50-51 58 65-66 
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587 
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769 



5-8 17 20-21 32-33 41 55 58 64 

75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 702 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 
adult liver 



Clontech 



adult ovary 



Invitrogen 



ALV003 
AOV001 



1550 1567 1578 1581 1583 1594 
1597 1601-1602 1611-1612 1615 
1618-1619 1621 1625 1637 1645 
1647 1652 1654-1655 1660 1666 
1669-1671 1684 1706 1722 1737- 
1738 1742-1744 1760-1761 1763- 
1765 1772 1774 

29 676 997 1063 1119 1536 1766 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188- 
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585- 
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796 
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108 
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195 
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283 
1286-1289 1291 1293-1294 1298- 
1299 1306 1308 1312 1317-1321 
1323 1327 1329-1330 1332-1333 
1338-1339 1341 1343-1351 1356 
1359 1361 1365-1366 1371-1375 
1377 1379 1383-1384 1386 1389 
1394 1400 1404 1416-1417 1422- 
1427 1429-1431 1435-1436 1439- 
1443 1445-1450 1453 1454 1459 
1463 1464 1466 1468 1470 1474- 
1481 1484-1485 1488 1491 1493- 
1494 1496-1498 1501-1504 1506- 
1507 1511-1517 1519 1521-1524 
1526 1527 1530-1531 1534-1536 
1538-1539 1541 1546 1548-1550 
1553 1555-1559 1561-1563 1566- 
1567 1569-1570 1572 1574 1575 
1578 1580-1581 1587-1588 1590- 
1591 1595 1597-1598 1600-1606 
1609 1611-1621 1623-1630 1634 
1636 1638 1641 1643 1645 1647- 
1657 1659-1662 1664 1667 1669- 
1671 1673-1674 1676-1681 1683- 
1690 1699 1702 1707 1710 1711 
1713 1714 1715-1719 1723 1724 
1726-1728 1731 1733 1735 1737- 
1738 1740 1741 1743-1744 1748- 
1751 1753 1755 1756 1760 1762 
1765 1767 1768 1770-1771 1776 
1778-1779 1783-1784 1786 



Clontech 



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



adult placenta 



APL001 



placenta 



Invitrogen 



APL002 



14-16 26 29 43 60-61 79-80 103 
106 116 135 171 177 180 194 196 
198 210 216 235-236 272 290 299 
309 329 334 339 359 379-380 417 
423 430 434-435 448 454 483 490- 
491 517 522 631 723 725-726 728 
738 746 769 818 843 854-855 857- 
858 916 948 953-954 976 988-989 
1005-1006 1013 1033 1036 1064 
1068 1070 1086 1139 1144-1145 
1160 1277 1285 1317-1320 1343 
1345 1429 1435 1438 1454 1482 
1486 1490 1512 1519 1532 1549 
1592-1593 1602 1626 1647 1649 
1664 1673 1675 1722 1727 1730 
1746 1776 



adult spleen 



GIBCO 



ASP001 



3 5-8 12 15 16 19-21 24 29 34-36 
44-45 57 60 82-83 87 89 94 98-99 
103 106 108 117 119-121 139 141 
147 152-153 155 166 169 171 174 
178-180 196 198 201-206 209-211 
215 219 234 253-254 256 258 264 
272 280-281 290 295 302 309 312 
325 333 341 349 358 372 382 386- 
387 394 406 414 431 434-436 446 
448 451 473 481 490-493 500 503 
505 517 519 530 534 536-540 547 
554 557 574-576 582 592 595 604 
611-612 620 621 623 631-632 642 
652 659 661 667 671 673-675 684 
700 721 728 730 732 738 742-744 
746 762 765 774 780 788-789 794 
810-811 817 822 830 832 845 848 
852-853 858 862 866 874 879 882 
testis 



GIBCO 



ATS001 



-884 906-908 912 919 921-923 926- 
927 934 942 949 957-958 963 977- 
978 983 990 992-994 996-997 999 
1005-1007 1010 1012 1031 1036 
1042-1044 1046 1049 1059 1068 
1070 1076 1089-1090 1094 1103 
1109 1113 1115 1124 1140 1163 
1170 1174 1177 1190 1196 1219- 
1220 1226-1227 1229 1236 1241 
1246 1258 1269 1271 1274 1295 
1301 1320 1322 1330 1334-1335 
1339 1349 1351 1353 1359-1360 
1364 1369 1374 1386 1397 1413 
1417 1434 1436-1437 1439 1468 
1474 1477 1480 1485-1487 1498 
1512 1522 1525 1544-1549 1553 
1560 1567 1591 1600 1631 1636 
1651 1654-1655 1658 1662 1670 
1674 1678-1679 1684 1686 1700 
1727 1733 1738 1740-1741 1760- 
1761 1774 1779 1781-1782 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789 
802 804 809 811 814 826 831 837 
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992- 
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675 1684 1690 1699 1705 1712 
1717 1724 1730 1737-1738 1752 
1767 1779 



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



686 1352 1412 



1411-1412 
Genomic DNA 
from BAC 39316 



adult bladder 



Research 
Genetics 
(CITB BAC 
Library) 



Invitrogen 



BAC003 



BLD001 



Clontech 



1352 



5-8 17 18 22-23 33 37-39 56-57 
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 353 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 789 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



bone marrow 



BMD001 



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000 
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1295 1287 1290- 
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343
1346 1349 1353 1356 1361 1367
1369 1372-1374 1379-1380 1394
1400 1403 1406 1408 1413 1417
1419 1423 1425-1427 1430-1431
1433 1439 1443 1446-1449 1459
1463-1464 1482 1486 1493-1494
1506 1503 1513 1521-1526 1528
1531 1536-1537 1543 1546
1548-1549 1552 1554-1555
1557-1559 1571-1572 1581
1589-1592 1597-1600 1609 1614
1621 1626-1628 1630-1632 1634
1636 1638-1639 1641 1646-1647
1651 1653-1655 1661-1662
1676-1681 1684 1686 1690 1702
1707 1711 1713-1714 1717 1720
1722-1723 1727 1737-1738 1740
1758 1767 1772 1781-1782 1785
1736



bone marrow



Clontech



BMD002



1 15-16 19 30-31 35-36 68-69 75
83-84 93 99 103 108-109 118 137
139 169-170 174 177 180 190 193
212-213 219 222 225-226 232 237
255 259 264 273-274 284 286 290-
292 295 301 303-304 307 312-313
316 324 326 330 334-335 348 352-
353 357 360 370-373 384 386-387
397 403-404 414-416 421 425-427
429-430 433-436 440 444 451 454
465-466 472 475 478 491 493 516
520 523 525 531 545 548 552 566
569-570 581 583 590-591 597-598
601 616-617 621 641 650 652 656
659 671 674-675 679 684 710 718-
719 728 734 737-738 742 761 765
774-778 790 811 814 818 830 834-
836 854-855 859 866 869 871 878-
879 884 889 892 904 922-923 932
990 992 998 1001 1004 1016 1036
1042 1048 1051 1054-1055 1058
1088-1089 1106 1112-1114 1155
1157 1192 1200 1223 1227-1228
1236-1237 1260-1261 1282-1283
1285 1287 1295 1314 1317-1321
1324-1327 1330 1333 1341 1343
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381
1383-1384 1394 1397 1400 1406
1413 1417 1425-1427 1438 1442
1446 1459-1460 1470 1493 1505
1521 1536 1546-1549 1560 1573-
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656
1658 1669-1670 1683-1684 1687-
1688 1690-1693 1696 1699 1702
1704 1707-1709 1711 1720 1722-
1723 1725 1727 1729 1731-1733
1738-1740 1743-1746 1752 1755
1760-1761 1767 1777 1781-1782
1786



bone marrow



Clontech



BMD004



73-74 503 922 1036 1711



bone marrow 



Clontech 



BMD007 



95-96 866 1320 1475 



adult colon 



Invitrogen 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
1462-1464 1512 1556 1583 1587
1594 1596 1614 1625-1626 1631
1639 1645 1650 1675-1677
1687-1688 1701 1713-1714 1724
1740 1765



Mixture of 16
tissues -
mRNAs



Various
Vendors



CTL016



401 1490 1666



Mixture of 16
tissues -
mRNAs*



Various
Vendors



CTL021



312 782 1132-1133 1403 1712 1715



BioChain



CVX001



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229-
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053- 
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
1406 1415 1425-1427 1431
1436-143? 1442 1446 1448 1453
1459 1466 1472 1478 1482 1496
1501-1503 1506 1512 1522
1527-1528 1531 1533 1541 1547
1569 1571 1585 1589 1597-1598
1600 1608-1609 1614-1616 1620
1623-1624 [illegible]-1628 1630
1638 1641 1643 1649 1653 1656
1662 1667 [illegible] 1674-1675
1683 1685-1688 1699 1702
1709-1710 1715 1717 1722 1724
1729 1731-1732 1735-1739 1741
1743-1744 1748-1749 1755
1760-1762 1767 1773 1778
1785-1786












diaphragm



BioChain



DIA002



137 282 289 730 780 986 1409
1478 1599 1614








endothelial
cells



Stratagene



EDT001



3 5-10 13 15-21 24-26 29 34 37-39
42 44-45 50-51 53-55 57-58
60-61 65-66 68-69 73-74 77-78 80
82-83 85 87 89 93-96 101-105 108
110 112-114 116 118-122 124 128
133-134 137-142 147-150 152-153
161-163 166-172 176-179 187 190
192 194 196-201 204-207 210 212-214
220 224 229-230 233 235-236
240-241 251-252 258 261-262 265
267-269 272 276-277 279-281 284-285
288 290 295-296 301-302 310-311
313 316 321 325 329 331-333
335 340 342 351-355 360 371 375
380-382 384 387 390 392 397 400
407-408 410 412 414 416 425-427
431 434-436 439 444-445 449 454
463-464 472-475 477-479 486 488-490
497-498 500-504 510-513 516-519
522 524 526-528 532-534 536-540
542-546 548 561-563 566-567
572-576 579 581 585-586 589 593
595 597 599 603 607-612 615-617
620 622 626 630 632-634 638-641
644 647 656-660 662-664 670 673
678 680-682 692-697 707 709-710
712-713 719 730 732 734 736 738
743-746 751 759 768 771 773 775-778
783 786-789 793 800 803 805-807
810-811 814 816-818 821-822
824 826 828-829 832 834-838 842-845
848-850 854-860 862 864 869
871 874 876-879 883 885 887 890-891
894-895 898-900 903 908 910-913
916 919-922 924 926-928 930-935
939 943 948-949 951-954 957
959-961 964 969-970 973 975-978
983-984 988-990 992-993 996-997
1000 1002 1004-1013 1016-1020
1022-1025 1028 1031 1033-1034
1038-1046 1050 1055-1056 1059-1060
1062-1064 1067-1070 1072-1074
1076 1078 1082 1086-1087
1089-1090 1093-1097 1099-1103
1107 1109-1113 1116-1117 1124-1126
1128-1131 1134-1135 1138
1140 1144-1145 1148-1149 1153
1157 1160 1163 1171 1183-1184
1198-1199 1202 1205-1207 1211
1216-1217 1219 1221 1225 1229
1232-1235 1238-1241 1243-1244
1246 1250 1253 1257-1258 1261
1265-1266 1268 1270-1271
1274-1277 1280-1283 1285-1286
1288-1290 1293 1295 1298 1308
1312 1317-1320 1324-1325 1327
1329-1330 1334-1335 1338
1342-1343 1345-1347 1350
1355-1356 1359 1367 1369 1374
1376 1379 1398 1400 1406
[illegible] [illegible] [illegible]
1419 1424-1426 1428-1431 1434 -
1440-1442 1448 1450 1462-1466
1466 1472 1474 1478 1487-1488
1491-1493 1501-1504 1506 1509
1511 1516 1520-1521 1526 1529
1531 1536-1537 1539-1540
1546-1547 1549 1552 1555
1557-1559 1561-1565 1568 1571
1575 1578-1579 1581-1583
1587-1588 1590 1592 1597
1605-1606 1611 1613 1615
1618-1621 1624-1628 1630-1631
1634 1636 1638 1641 1643-1650
1652-1659 1664 1666-1667 1669
1671 1675-1681 1683-1688
1696-1698 1703 1711 1715-1716
1719 1722-1723 1726 1731-1733
1736 1739-1741 1743-1744 1749
1755 1760-1761 1765 1767-1768
1771-1773 1776 1779 1783-1786



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



286 686 1297 1303-1304 1352 
1411-1412 1754 



esophagus 



BioChain 



ESO002



131-132 261 289 380 503 860 892 

1000 1007 1397 

62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain 



Clontech 



FBR001 



fetal brain 



Clontech 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 631 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593



fetal brain 



Clontech 



FBR006 



5-9 25 43 60 62-63 65-66 70 72 
80 87 92 101 103 108 114 136 139
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 638-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 693 701 706
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843 
858 861 864 871-872 884 890-891 
894-895 890 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992 



fetal brain 



fetal brain 



Clontech
Invitrogen 



FBRS03 



FBT002 



fetal heart 
fetal kidney 



Invitrogen



Clontech 



FHR001 



FKD001



999 1001 1005-1006 1008 1013 
1016 1022 1024 1029-1030 1032 
1035 1042 1047-1048 1052 1056 
1065 1067 1070 1082 1089 1109
1114-1115 1119 1131 1143-1149 
1151 1153-1156 1160 1163 1167 
1172-1173 1178 1184 1186 1188
1190-1200 1211 1216 1222-1223 
1226-1227 1229 1231 1236 1245 
1253-1255 1258 1260 1262 1266 
1270-1273 1281 1287 1308-1309 
1314 1317-1320 1326 1334-1335 
1339 1341 1344 1350 1356 1369- 
1371 1373 1376 1379 1381-1382 
1386 1392 1396-1398 1419 1423 
1425-1426 1428-1429 1432 1437 
1440-1441 1448 1466 1470 1482 
1502-1503 1507 1511 1513 1516
1519 1536 1544 1549-1550 1557- 
1559 1573 1589-1590 1598 1608 
1611-1614 1619 1621 1625-1626 
1640 1651 1657-1658 1676-1679 
1693 1696 1703-1704 1713-1714 
1718 1720 1722 1724 1726 1728 
1730-1733 1735-1736 1738-1739 
1742 1745 1755 1759-1761 1765 
1767 1771-1772 1777 1779-1780 
1786 



235-236 520 864 1068 1188 lbBY



15-18 20-21 24-25 29 34 43 61-63
77-78 98 101 103 107-108 128 130
136 146 148 165-166 171 174 181
185 196-198 204-205 208 223 230
235-236 251 253 251 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858
897-900 904 919 925 935-937 946
948-949 954 960-962 966 969-970
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055
1068 1070 1072 1078 1082 1085
1090 1109 1115 1118 1120 1128
1136-1137 1144-1145 1149 1156-
1157 1193-1195 1198 1204-1205
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



105 124 160 289 864 1036 1140
1229 1614 1616 1762 1785 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 



fetal kidney 



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 760 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 



Clontech 



FKD002 



fetal kidney 
fetal lung 



Invitrogen 
Clontech 



FKD007 



FLG001 



fetal lung 



Invitrogen 



FLG003



fetal lung 



Clontech 



FLG004 



fetal liver - 
spleen 



Columbia 
University



FLS001 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
468 475 483 488 493 516 531 535
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413-
1414 1431 1438 1449 1491 1512
1536 1547 1557-1560 15S7 1590
1601 1636 1644 1653-1655 1662
1667 1671 1675 1680-1681 1706
1739 1760-1761 1769

03 276 334 465-466 737 843 1131 
1614 1658 



3-11 13 15-21 25 30-39 41-48 50-51
54 56-58 60-66 68-69 72 75
77-80 82-83 85 87 89 92-103 105-110
112 116 124 126-127 130 133
135-139 141 144 147-149 152-153
157 163-165 167-172 174 176-178
180 186 188 190 193-194 196 198-200
202-206 210-214 219 221-231
233-236 240-244 246-247 250-251
255-256 258 261-265 268-269 272
274 276-278 280-281 284-286 288
293 295 299-301 304 306-307 309
311 314 316 318 320-321 326 329-332
342 344-345 350 352-353 356-358
360 362 370-374 376 378-384
386-387 390 392-393 400-401 403
406 408 410 412 415 417 419 422-437
439-442 444-445 448 452-454
456 459 461 470 472-479 481-483
487-488 490 491 493 500-501 503-506
509-513 515-520 522-524 526-529
531 534 536-540 542 547-549
553-554 561 562 564 567-568 571-576
579 581 583 585-597 599-605



fetal liver- 
spleen 



Columbia 
University 



FLS002 



607 610-613 615-621 623-624 O^b 
628-634 636-640 644 647-650 655-
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-729 725-728 730- 
731 734 736 738 740-741 743-746
748 750-751 759-766 768 772 774- 
777 779 783-788 793 796 798 800- 
805 808 810-812 814 818-819 821-
824 826-832 834-837 843-847 849-
867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536-
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667-
1669 1671 1673-1674 1676-1688
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740 
1741 1743-1744 1746 1748 1751 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64
68-69 73-75 78 80 82 84 87 95-98
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 1^-194 196-198 200-203 
WO 01/53312 



PCT/US00/34263 






206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 SOS- 
SOS 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894-
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567-
1569 1580 1587-1588 1591-1592 



fetal liver- 
spleen 






Columbia 
University 



FLS003 



1597-1598 1600-1601 1611-1612
1618-1628 1630-1631 1635-1638
1641 1646-1649 1652 1654-1659
1661-1662 1664 1667-1669 1674
1676-167S 1683-1684 1686-1688
1691-1692 1699 1702 1707 1711
1713-1714 1717 1719 1722 1726-
1727 1730-1733 1738 1740 1743-
1744 1748-1752 1758 1760-1761
1763-1764 1767 1769 1772-1773
1776 1779 1783-1786



103 300 318 321 362 372 379 381
384 392-393 403 422 424 429 434-435
440 444 453 503 515 544 592
978 1064 1324-1325 1327 1333
1357 1369 1378 1418 1424 1622
1646 1649 1680-1681 1689-1690
1717 1743-1744 1769



fetal liver 



Invitrogen 



FLV001 



fetal liver | Clontech | FLV002
fetal liver | Clontech | FLV004



fetal muscle | Invitrogen | FMS001



15-16 26 "34 58 61 64 70 75 7» b3 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872-
873 875 881 889 894-895 909 911
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362-
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622
1644 1649 1666 1674 1706 1721
1738 1746 1763-1765 1774 1776 
1779 



676 998 1719 

93 133 214 301 3bb J/4 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 

1733 1760-1761

26 37-39 50-51 58 84 86 89
113 128 131-132 139 155 172 186
194 198 201 206 211 230-231 256
261 276 282 286 302 325 359 361
375 379 383 398 412-413 419 430
435 448 452 462-463 473 477 503
519 529 561 569-570 590-591 597
607 623 626 635 647 660 672 715
725-726 730 733 761 775-777 788
826 837 860 874 913 915 921 935
970 980 986 988-990 992 1000-1001
1007 1014 1027 1035-1036
1045 1060 1064 1070 1083 1097



1099-1102 1116-1117 1121 1164
1173 1198 1208 1228 1240 1258
1266 1270 1277 1298 1317-1320
1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409
1433 1505 1514 1542 1551 1554
1557-1559 1562 1589 1599 1620
1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754
1766



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788-
789 798 809 811 814 816-817 822
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 



1626 1632 1634 1636 1641 1643-1644
1646 1654-1657 1660-1662
1665 1668 1675 1685 1687-1689
1702-1703 1709-1710 1716 1719
1724 1727 1731-1732 1737-1740
1742 1747 1749 1755 1760-1761
1765 1772 1776-1777 1779-1780
178* 



fetal skin 



Invitrogen 



FSK002 



fetal spleen 
umbilical cord



BioChain 



FSP001 



BioChain 



FUC001 



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326
1333-1335 1343 1347 1350 1369- 
1371 1377-1378 1391 1397 1422 
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755



110 137 211 353 589 927 1108 
1639 1771 



8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 358 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897-
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



fetal brain 



GIBCO 



4 9 11-13 17-18 22-23 25 37-39
42-47 50-51 54-55 58 60-61 65-66



72 75 77 80 82 85 90-91 94 100- 
102 107 110 112-116 118-119 122- 
123 126 128 134 136-140 147-148 
153-155 157 161 165 169-172 175 
181 186 188-189 197-198 204-206 
208 210 215 222-223 225-226 230 
235-238 240-241 247 253 256-258
260-262 267-269 276 279-281 284 
286 289 298 300-302 307 310 318 
321-323 325 330-331 339 341 346- 
349 352 354 356-359 362 364-365 
371-372 377 379-380 382 384 387 
390 400 408 414-416 419 424 431 
434-435 438 441-443 449 451 453-
455 457-463 470 472-473 475 477-
478 482-483 486-488 490-491 493 
496 499-500 502-504 506-507 509- 
512 516 519-520 522 525-526 529- 
530 537-540 543-544 546-547 566- 
567 569-570 572-582 585 588 590- 
591 593 595 599 601 604 606-609 
611-612 614-620 622-624 630 632 
636 643 645-647 650-652 654 659 
661 665 667-668 670-672 676 678 
681 687 689 692-694 697 699 710 
714 717 721 727 729-732 734 736 
738 743-746 750-751 759 763 766 
770 772 775-777 784 789 791 796 
799 802-805 810-811 814 819-821 
824 826 830 834-837 839-850 854- 
856 858-860 862 864 869 871 876- 
877 879 883 886-887 890-891 893-
895 898-901 905 908-910 912-916
919 922-923 925 927 930-933 935- 
938 948 952-960 963-964 967 969- 
972 975 978-979 981 983 986-987 
990 992 995 997 999-1002 1005- 
1005 1011-1013 1016 1018-1019 
1023 1026 1029-1031 1033-1035 
1038 1041 1047 1050 1053 1057 
1059 1064 1068 1070 1072-1073 
1078-1079 1081-1082 1086 1089 
1094 1097 1103 1107-1109 1113- 
1115 1121-1122 1127 1134-1135 
1138 1140 1143 1148-1151 1153 
1156-1157 1159 1167 1170 1175
1193-1194 1200 1202 1207-1209 
1211 1216 1219-1220 1226-1227
1229 1232-1234 1240-1241 1243 
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282 
1285-1289 1293-1294 1305 1307- 
1308 1312 1316 1320 1327 1338- 
1339 1341-1344 1346 1349 1355- 
1357 1359 1365-1366 1369-1370 
1373-1375 1379 1386 1389 1394 
1398 1409 1413-1414 1416-1417 
1420-1421 1425-1427 1430 1433 
1437 1439 1442 1445-1452 1454- 
1457 1459 1463-1464 1468 1470 
1474 1477-1479 1489 1492 1494 
1497-1498 1501-1503 1507 1509
1511-1513 1517 1520-1521 1524- 
1526 1531-1533 1535 1537-1538 
1547 1554 1556-1559 1564-1567
1571 1584 1587 1589 1594 1599-
1601 1611-1612 1614-1616 1619- 
1620 1625-1628 1630-1631 1634 
1637-1638 1640-1643 1645 1648- 
1649 1651 1653-1655 1657-1658
1664-1665 1667 1669 1673
1678-1679 1683-1684 1686 1693
1701 1704-1705 1709 1713-1714
1717-1720 1724 1727-1728
1731-1733 1737-1738 1743-1744
1752 1754-1755 1757 1760-1761
1765 1772 1779 1785

macrophage
Invitrogen
HMP001

5-8 110 204-205 503 634 678 859 87R
933 988-989 1379 1448 1504


infant brain
Columbia University
IB2002

10 12-13 15-18 22-23 25 29 34
37-39 43 47 50-51 54-56 58 60-63
65-66 68-69 72-74 80 82-83 86
88-92 97 100 102-104 106-108 110
112-113 115-116 118 123 128 130
134-136 138-139 143 147-149
151-152 154-155 163 165-169
172-175 181-184 186 193-196 198
201 203-205 209-210 214-215 222
224-226 231-232 235-236 239
246-247 252 257 260 268-269 272
276-277 279-281 286 288 291-292
295 298 300-301 304 307 310 313
321-323 330-331 333-334 339
346-347 349 352 356-357 362
371-372 377 379-380 383-384 392
397 401 406 408 411 413-414 416
418-419 422 428 430-431 434-435
438 443 449 453-454 461 464-466
469-470 472-473 475-476 478
482-483 487 490 492 494 497 503
507-508 510-513 516 519-520
524-526 530-534 536-540 547
550-551 561 563-564 566-567
572-576 579 581-582 584-587
590-591 593 595-597 607-609
611-613 616-617 620 622-624 627
631 637 641 645-647 650-655
657-658 660-665 667-675 689 691
695 697 699 703 707 713-715 717
721 728-731 733-736 739 743 745
751 755 759 763 769-770 772 778
780-781 785 788-789 793-794 799
803 808 811 814 825-826 830
834-836 840-843 845 848-850
854-855 860 862 864-865 870 872
875-876 878 886 883 890-891
894-896 898 903-904 916-917 919
922-925 927-928 930-932 934-936
938 941 945-946 948-950 953-954
959-962 966-969 977 979 981
986-990 992 997 999-1000
1004-1006 1014 1016 1018-1019
1024-1025 1033 1036 1047
1051-1052 1054-1055 1057-1059
1063-1064 1068-1070 1073
1081-1082 1085 1089 1108-1113
1118-1120 1123-1124 1130
1132-1138 1140 1149 1151
1153-1154 1163-1170 1172
1174-1175 1183-1184 1188 1190
1193-1194 1196-1197 1199 1204
1208-1209 1211 1218-1222
1226-1227 1229 1231 1234 1241
1247 1249 1251 1256 1258
1261-1262 1269 1274 1279 1281
1283 1285 1287-1289 1294-1295
1305 1307 1313-1314 1316-1320
1329 1332 1341-1342 1345 1349
1356 1362-1363 1365-1366
1368-1370 1374 1381 1383-1384
1388 1400 1403 1406-1407 1413
1417 1420
WO 01/53312 






Tissue Origin 



infant brain 



RNA Source 



Columbia 
University 



infant brain



infant brain 



Hyseq 
Library Name 



SEQ ID NOS: 



Columbia 
University 



Columbia 
University 



1423 1429-1431 1435-1436 1439- 
1441 1443 1447-1449 1451-1452 
1454-1455 1457 1459 1463-1465 
1468 1470-1471 1475 1479 1482- 
1483 1485 1493-1494 1496 1498-
1499 1502-1503 1505-1507 1509 
1522-1523 1525 1528 1531-1533
1542 1546-1547 1549-1550 1554- 
1555 1563 1565-1567 1569 1575
1580 1583-1586 1588 1590 1592- 
1593 1595 1598 1600-1601 1608- 
1610 1612 1614-1616 1619 1621 
1624 1626-1627 1630-1633 1637 
1639-1640 1642 1644 1647 1652 
1654-1655 1658-1659 1664-1665
1672-1673 1676-1681 1685-1688 
1693-1695 1701-1702 1704 1708 
1717-1720 1723-1724 1726-1728 
1733 1735-1741 1743-1744 1752 
1755-1758 1762 1765 1771 1774 
1777-1778 1786 



IB2003

17-18 20-23 29 34 43 60 68-69

78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301 
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674 
675 678 689 715 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1535 1546 1557 
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



IBM002

101 113 139 152 260 279 290-292

374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 

1779

IBS001

10 12 119 175 279-281 321 334

371 446 551 563 623 652 667 669 



Tissue Origin 



lung, 
fibroblast 



RNA Source



Hyseq 
Library Name 



Strategene 



LFB001 



lung tumor 



Invitrogen 



LGT002



SEQ ID NOS: 



671-672 819 949 966 1113 1130
1151 1188 1193-1194 1196 1229 
1258 1265 1271 1287 1317-1319 
1324-1325 1342 1423 1440-1441 
1448 1471 1482 1525 1532 1546 
1562 1569 1588 1591 1610 1618 
1647 1649 1658 

5-9 17 20-21 2b 68-69 82 94 105
153 157 197-198 203 207-208 212- 
213 223 262 266 233 302 321 326 
333 356 370 427 430 436 446 462 
472 493 498 503 516 519 527 535 
537-540 542-544 562 565 567 586 
599-600 607 615 630 647 662-664 
692-694 712 719 745 748 775-777 
794-796 810 837 843-847 849 854- 
856 869 876 903 934 953 955-956 
964 975-976 984 1000 1005-1007 
1024-1025 1033 1039 1053 1064 
1070 1072 1082 1112-1113 1134 
1136-1138 1140 1195 1223 1232- 
1233 1246 1279 1285 1295 1311 
1320 1334-1335 1343 1427-1428 
1446 1478 1482 1493 1504 1537 
1552 1555 1567 1575 1582 1598 
1620 1625 1632 1638 1645 1654- 
1655 1662 1680-1681 1684 1686 
1690 1696 1702 1711 1733 1741 
1760-1761 1778 1785 



5-10 18 20-21 29 33-36 40 43 
54-55 61 65-66 68-70 73-75 80 85 
88-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126 
130-132 135-137 139-141 143-144 
147-148 151-153 155-156 159 161 
164 169 171 179-180 185 190 192 
194 196-199 203-208 210 212-214 
216-217 219 222 233 240-241 244 
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409 
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496- 
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576 
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656 
660 662-665 667 669 672 683-684
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752 
763 765-766 773-778 784-785 787- 
789 791 800 802-803 809-812 814
824 826 828-829 832 838-839 841- 
845 849-850 852-855 857-861 864 
866 874 878-880 882 887 890-891 
897-898 902 904 906-907 910 916 
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956 
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995
997 999-1001 1005-1007 1009 
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041 
1045 1047-1050 1052 1054 1055
1059 1063 1064 1067-1071
1073-1074 1078 1085 1087 1089
1095-1097 1104 1106 1107 1109
1112 1116 1117 1119 1126
1134-1135 1139 1141-1142
1144-1145 1148 1152 1153 1156
1158 1167 1170 1172 1178
1195-1196 1198-1200 1202 1204
1208 1214 1216 1219 1222 1227
1234 1241 1247 1252 1257-1258
1265 1267-1270 1276 1278
1280-1281 1283 1285 1288-1289
1295 1300 1305 1308 1312
1317-1321 1329 1338-1339 1341
1344-1346 1349 1351 1353 1355
1357 1365 1366 1369 1378 1379
1383 1385 1394 1397 1400
1402-1403 1408 1417 1419 1423
1426 1431 1433 1436 1438 1444
1446-1448 1454 1455 1460 1466
1468 1470 1474 1480-1481 1483
1486-1488 1490 1491 1494-1496
1506 1508 1509 1511-1512
1515-1516 1519 1523-1524
1528-1529 1536-1540 1546
1549-1550 1555 1560-1561 1565
1567 1569 1575 1588 1591
1593-1594 1596-1598 1600-1602
1608 1614-1616 1618 1620
1624-1625 1627-1632 1636 1639
1644-1645 1647-1649 1652-1653
1656-1662 1664 1666-1667
1670-1671 1673-1675 1678-1679
1683 1685-1688 1690-1692
1695-1699 1705 1709 1716-1717
1722 1727 1730 1735 1739 1741
1743-1744 1748-1749 1753
1760-1762 1765 1767 1770-1771
1773 1775 1776 1778-1779 1786

lymphocyte
ATCC
LPC001



4 11-12 18 24-25 30-31 48 50-51 
56-57 68-69 80 92 98 103 105 110 
126 137 152-153 157 165 172 188- 
189 197 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 280-281 284 300-301 321 325- 
326 339 348 352 357 371 382 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-448 451 454- 
455 475 503 516 526-527 530 537- 
540 549 556-560 563 574 577 589
602 613 615-617 621 623 628-630 
636-637 647 649 657-659 690 697 
717 723 755 764 775-777 780 786 
789-790 793 800 802 822 838 849 
866 869 876 881-883 892 898 906- 
907 911 921-923 928 975 990 992 
996 1001 1004-1007 1033 1050 
1054 1078 1107 1135 1140-1141 
1143 1148 1158 1163 1177 1199 
1205 1216 1226 1231 1236 1241 
1244 1250 1258 1260 1265 1269- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1369 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1428 
1431 1437 1446 1448 1461 1466 
1470 1472 1474 1482 1492 1506 
1528 1537 1546 1549 1591 1598 
1600 1603-1604 1606 1627 1636 
Tissue Origin

RNA Source

Hyseq
Library Name

SEQ ID NOS:






1638 1647-1649 1651 1658-1659
1664 1676-1677 1680-1681 1687-
1688 1699 1711 1715-1716 1726
1728 1737 1740 1746 1748 1752
1756 1758 1777 1779


leukocyte


GIBCO


LUC001


J-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51
54-58 60-63 68-69 75 79-80 82-83
85 88-91 93-96 98 100 103-104
107-108 112 116 119 123 125-128
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
212-215 217-219 222-223 229 235-
236 247 251 255-258 260 262 272 
274-277 280-281 285-286 297-301 
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421-
425 430-431 434-435 437 439 441- 
442 445-451 453-454 456 459 461- 
464 468-472 474-479 481 483-485 
487-491 496 499-501 503-504 509- 
513 516-519 522 526-527 529-531 
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584-
586 589 593 595-597 601-602 604
606-607 611-613 615-621 623 627-
629 633 636-637 642 644-650 655
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
738-739 743-746 749 751 753 756 
759 765-766 768 770-773 780 784- 
786 788-790 793 796 798 800 802-
803 810-811 814 817 819 826 828-
830 832 834-836 838 843 845-860
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960-
962 964 967 970-971 973 975 977
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178
1180 1182-1183 1186 1195 1198-
1200 1202 1205-1206 1211 1216
1219-1221 1223-1227 1230-1236
1238-1242 1247 1252 1254 1256
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425-
1428 1430-1431 1433-1434 1437-
1438 1440-1442 1446-1448 1450
1453 1458-1459 1463-1464 1468
1470-1471 1474 1477-1478
1482-1488 1490-1493 1496-1501
1504 1506 1509 1512-1513 1516
1519 1521-1522 1524-1525
1527-1528 1531 1534 1538 1541
1545-1547 1549-1550 1553
1555-1556 1560 1565 1567 1575
1580 1589 1591 1594 1596 1598
1600-1602 1606-1608 1611 1614
1620 1621 1624 1626-1629
1631-1632 1636 1638-1639 1641
1644-1645 1648-1650 1653-1655
1658 1660 1662 1669-1670
1675-1679 1684 1688 1690-1692
1696 1700 1702 1707-1709 1711
1716-1717 1720 1723 1725-1727
1733 1737-1738 1741 1743-1744
1748-1749 1752 1755 1760-1762
1765 1769 1771-1772 1781-1784
1786

leukocyte
Clontech
LUC003

4 35-36 44-45 61 68-69 75 82 102
119 139 154 179 197 244 280-281
324 372 404 430-431 455 461 476
477 481 503 537-540 554 575-576
581 589 608-609 621-622 624 630
632 647 662-664 669 679 698 764
773 775-777 802 848 851 856-857
879 905-907 915 949 952 990 992
1002 1113 1119 1170 1183 1216
1236-1237 1241 1275 1346 1353
1357 1359 1377 1506 1515 1534
1553 1591 1600 1613-1614 1621
1628 1670 1676-1677 1691-1692
1699 1733 1738 1772



melanoma from 
cell line ATCC 
#CRL 1424



Clontech 



MEL004



25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647
660 665 734-735 737 759 778 787
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278-1280
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



mammary gland 



Invitrogen 



MMG001 



5-8 10 12 14-18 20-21 24 25 29
33-39 42-43 52 55-58 60-64 68-69
71 73-74 79 80 82 89 98 100 103
106 108 112 123 128 133-137 144-
146 148 150-152 154 158-159 165-
166 170-172 174 176 178 181-185
188-190 194-198 201-206 210 217-
222 224 227-228 231 233-237 247
251 253-254 256 261-263 266-267
271 276-277 279-281 284-286 288
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



induced neuron 
cells 



Strategene 



290 297 299 301 304 309-312 318
320-321 323-325 327-329 331-332 
334 339 341 344-345 348 350 356 
359-360 362-363 368 371 376 379- 
303 380 390 393-395 397-398 405 
408 412 414-415 423 430 434-437
441-444 448 451-455 462-464 474 
476 479 482 485-486 488 490 494-
495 498 503 506 509-512 516-517
519-520 522 527 529 534 537-541
547 549 554 557 562 572-574 587
589-591 597 602 607 618 623 628- 
629 632 634-640 644 647-648 650- 
652 655 657-658 660 665 667 669- 
672 674-676 679 682 688 695-696 
706-707 710 713 717 720 722-730 
732-734 736 738 743 747-748 750 
755 759 761 766 770 780 784 786-
789 794 803 806-807 809 814 817-
822 827-829 837 842 854-858 863-
864 866 869-870 872 878 881 889 
893-900 904 906-907 911 916 919 
921-923 926 935-937 946 948-949 
953-954 957 960-961 963 965-966 
970 977-978 984-989 993-997 
1000-1001 1005-1006 1008 1013-
1014 1016-1017 1023 1025 1027 
1032-1033 1036 1039 1043 1045 
1055 1057-1058 1063 1068-1075 
1077-1078 1085 1087 1089-1091
1095-1102 1107-1108 1112-1119
1121-1123 1131-1133 1136-1137 
1139-1142 1144-1145 1148-1149 
1153 1159 1167 1170 1172-1173 
1183-1185 1190-1192 1196-1199 
1207-1208 1212 1216-1218 1222- 
1223 1225 1231 1234 1240-1241 
1247 1253-1254 1258-1259 1261- 
1262 1270-1280 1283 1285-1286 
1298 1307 1314 1316-1320 1323- 
1325 1330 1334-1335 1342-1345 
1349-1352 1354-1355 1359 1369-
1370 1377 1379 1381 1383-1384 
1389 1405 1414 1419 1421-1423 
1425-1426 1428-1429 1431 1434- 
1437 1439 1448-1449 1454 1457 
1460-1464 1466 1471 1480-1483 
1487 1489-1491 1493 1505 1507 
1512 1519 1526-1528 1532 1534 
1536 1539 1542 1547 1549-1550 
1554 1561-1562 1564 1567 1572 
1576-1579 1581-1582 1587-1588
1592 1594 1596-1597 1601-1602 
1607-1608 1610 1612-1616 1618 
1621-1622 1625-1626 1631 1635- 
1636 1641 1643-1644 1647 1650 
1652 1654-1655 1657-1658 1660 
1662 1664-1666 1669-1671 1673-
1674 1676-1677 1680-1685 1689- 
1692 1701 1706 1713-1715 1719- 
1720 1723-1728 1730-1732 1738 
1740 1742-1744 1746-1747 1749 
1751 1753 1760-1762 1765-1768
1771 1774 1776-1777 1779 1783-
1784 1786



NTD001 



29 35-36 80 116 123 156 163 l«.i 

214 230 280-281 284-285 307 321 

330 340 358 371 375 377 380 382 

422 424 492 497 532-533 542 546 
Tissue Origin

RNA Source



retinoic acid

induced 
neuronal cells 



Strategene 



Hyseq 
Library Name 



neuronal cells 



pituitary 
gland 



placenta 



Strategene 



Clontech 



prostate 



rectum 



Clontech 
Clontech 



Invitrogen 



NTR001



NTU001 



PIT004 



PLA003 



SEQ ID NOS:



549 566 586 595 612 645-647 654
734 775-778 780 792 799 821 826
856 858 875 936 953 985 990 992 
1041-1043 1055 1072 1104 1193- 
1194 1206 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320 
1349 1359 1412 1423 1485 1620 
1623 1645 1684 1705 1715 1751 



5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425- 
1426 1547 



29 65-66 80 82 110 119 146 152 
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



PRT001 



REC001



311 314 379 408 419 430 454 1055

1095-1096 1272-1273 1312 1320 

1378 1652 1671 1720 1725 1736 

1741 1755 

5-8 124 208 277 370 843 906-907

1280 1317-1319 1359 1609 1621 
1737 



9 46 57 71 107 147 171 177 197 

201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475
1476-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948-
949 960 986 996 1020 1023 1033-
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425- 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



1426 1436 1439 1469 1474 1477

1482 1546 1587-1588 1592 1596 

1610 1622 1627 1644 1658 1662 

1665-1666 1669 1675-1677 1749 
1786 



10 55 97 103 110 140 149 152 158
198 217-218 242-243 256 301 308
312 321 333 351 354 360 410 437 
448 473 487 494 496 501 535 555 
569-570 572-573 590-591 624 636 
651 759 762 764 768 771 788 800 
809 826 848 865 879 906-907 925 
933 963 1016 1020 1025 1040 1046 
1055 1066 1103 1150 1172 1181 
1234 1281-1282 1288-1289 1298 
1315 1320 1333 1336-1337 1346 
1359 1373 1379 1424 1447 1449 
1474 1482 1492 1494 1498 1511 
1523-1524 1537 1554 1596 1626- 
1627 1636 1652-1655 1658 1665 
1671-1672 1691-1692 



salivary gland 



Clontech 



SAL001 



salivary gland 



Clontech 



skin 
fibroblast 



ATCC 



SALS03
SFB001 



158 326 1423 1463-1464 



skin 
fibroblast 



skin 
fibroblast 



small 
intestine 



ATCC 



ATCC 



Clontech 



skeletal 
muscle 



skeletal 
muscle 



skeletal 
muscle 



skeletal 
muscle 



spinal cord 



Clontech 



SFB002 



SFB003 



SIN001 



1320 1400 



262 736 1025 1253 



709 1119 1350 1631 1653 



25 142 146-147 151 155 198 203
244 260 271 280-281 286 288 298
301-302 308 312 334 340 371 398
408 412 414 416 423 425-427 430
434-435 445 452 454 478 503 516
519 521 523 543 547 549 555 559
565 569-570 585 592 604 611 626
628-629 632 650 659 681 710 714
718 750 764 780 798 829 842 857
859 866 887 892 894-895 901 904
906-907 912 919 935 997-998 1000
1007-1008 1026-1028 1044 1055
1089 1097 1116-1117 1131 1148
1169 1199 1219 1234 1247 1264
1279 1316 1320 1326 1341 1343
1349 1351 1374 1387 1398 1400
1403 1407 1423 1428 1468 1498
1501 1521 1550 1556 1585 1597
1636 1638-1639 1645 1653 1656
1662 1671 1675 1684 1691-1692
1704 1711 1717 1719 1722 1725-
1726 1729 1733-1734 1743-1744
1762 1767 1780 1785



SKM001



Clontech



Clontech



Clontech
Clontech



SKMs03



SKMS04



SPC001



18 20-21 82 84 101 118 134 148 
151 153 166 225-226 258 274 277 
289 329 361 412 414 424 440 452 
459 470 488 503-504 537-540 647 
660 673-675 715 773 780 786 830 
905 922 950 963 982 990 992 1020 
1047 1063 1115-1117 1121 1134 
1228 1268 1284 1298 1321 1329 
1336-1337 1343 1409 1413-1414 
1509 1599 1624 1644 1653 1712 



168 1683 1712 



235-236 1409 



235-236 



4 9 11 17 30-31 35-36 43 46 60 
Tissue Origin

RNA Source



adult spleen 



Clontech



Hyseq 
Library Name 



SPLc01



SEQ ID NOS: 



82 85 92 94 108 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300-302 304 315 
317 372 379 387 392 419 426-427 
430 433 448 467 473 487 489 506 
509 513 519 524 526 537-540 543 
547 549 551 559 567 569-570 5S3 
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681- 
682 709 711 715 719 728-729 734 
749-750 753 775-777 782 789 791 
809 820 832 834-836 847-849 854- 
855 858 861 864 871-872 875 884 
898 906-908 917 919 924 934 942 
944 970 985 990 992-993 998 1013 
1039 1053 1059 1065 1072 1075 
1077 1082 1085 1097 1103 1109 
1116-1117 1128 1134 1151 1170 
1174 1192-1194 1215 1225 1241 
1243 1283 1294 1307 1312 1320 
1323 1327 1330 1350 1353-1354 
1356 1359 1368 1375 1400 1406- 
1407 1423 1429 1437 1443 1448 
1454 1470 1482 1492 1501 1508 
1511 1529 1538 1548-1549 1565 
1571 1578 1598 1600 1614 1625 
1627 1630 1639 1646 1651-1652 
1670 1686 1696 1740 1751 1755
1771 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



Clontech



STO001 



thalamus 



Clontech 



THA002 



10 15-18 61 68-69 100 117 149
197 201 227-228 231 249 273 280-
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739 
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



thymus 



9 11 25 85 87 112 137 146 180 

190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 505 525 531 549 567 606
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753 



Clontech 



THM001 



44-45 54 57 
126 134 153 
243 258 274 
327 330 333 
430 445 465- 
493 503 506 



"58 62-64 79 104 123 
193 212-213 218 242- 
277 279 297 301 307 
342 351 358 371 410 
466 468 471 483 487 
509 517 526 535 537- 
Tissue Origin



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



540 546 548 554 567 584 586 590- 
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720
728 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 1646
1649 1654-1655 1661-1662 1669-
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



thymus 



Clontech 



THMC02 



5-9 15-21 25 33 35-36 43-45 48 
50-51 54-55 60 75 83 87 89 93
93-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 488 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549 
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786 
Tissue Origin



thyroid gland 



trachea 



RNA Source 



Clontech 



Clontech 



Hyseq 
Library Name 



THR001



TRC001 



SEQ ID NOS: 



4 9-10 20-21 37-39 48 50-51 54- 
57 60-61 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155-158 163-164 168-169 171
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 280-281 
284-286 288-289 298-299 302 309- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549
562 564 569-570 575-576 588 594- 
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823- 
824 826 828 833 838 841-845 847 
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898 
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978-
979 981-982 987 990 992 1001
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436-
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555-
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644-
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 



29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331 
Tissue Origin 


RNA Source 


Hyseq 
Library Name 








SEQ 


ID NOS: 










352 


372 


377 


384 


414 


424 


445-446 








454 


472 


474 


491 


496 


560 


579 588 








593 


597 


607 


612 


626 


681 


702 719 








810 


859 


866 


878 


894- 


895 


912 916 








922 


932 


935 


1046 


1075 1080 1099- 








1102 


1113 


1208 


1215 


1232 


-1233 








1237 


1281 


1312 


1385 


1387 


1405 








1414 


1424 


1430 


1437 


1447 


1505 








1569 


1579 


1586 


1600 


1641 


1653 








1667 


1671 


1676- 


1677 


1683 


1691- 








1692 


1711 


1717 


1726 


1772 




uterus 


Clontech 


UTR001 


17 19 25 41 


46 


57-58 61 


89 104 








108 


139 


152 


174 


198 


200- 


201 206 








263- 


265 


274 


290 


387 


408 


420 438 








446 


448 


452 


473 


491 


493 


499 503 








506 


513 


519 


522 


526 


530 


542-543 








560 


601 


610 


632 


659 


665 


720 751 








773 


780 


833 


845 


857 


872 


877 912 








929 


934 


937 


996 


1009 1011 1018 








1050 


1075 


1107 


1124 


1170 


1219 








1258 


1279 


1287 


1310 


1320 


1323 








1343 


-1344 


1375 


1437 


1451 


-1452 








1478 


1481 


1498 


1519 


1521 


1536 








1552 


1579 


1597 


1602 


1606 


1620 








1626 


-1627 


1649 


1652 


1661 


1670 








1719 


1722 


-1723 
TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PROH14 protein 
sequence . 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein 
PRO943. 


2389 


99 


3 

4 
5 


AF113136 

AF017806 
X02761 


Homo sapiens 

Mus musculus 
Homo sapiens 


IL-1 receptor-associated 
kinase-M; IRAK-M 
Zn-15 transcription factor 
fibronectin precursor 


3043 
6351 


100 
77 


6 
8 


X02761 
X02761 


Homo sapiens 
Homo sapiens 


fibronectin precursor 
fibronectin precursor 


10535 

8990 

12564 


98 
89 
99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 
13 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


896 


100 




Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 

15 


AF213457 
AF233453 


Homo 
sapiens 
Homo sapiens 


triggering receptor expressed 

on myeloid cells 2 

RACK-like protein PRKCBP1 


1238 


100 


17 
18 


AF201303 
AF064205 


Homo sapiens 
Homo sapiens 


dhfr oribeta-binding protein 
RIP60 

dynactin 1 p150 isoform 


3124 
3130 


99 

OR 


19 


U00059 


Saccharomyce 
s cerevisiae 


Ynrl21wp 


63 77 
174 


JLUU 

26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O- 
sulfotransferase 


2211 


99 


25 

26 
27 


U33460 

Y44488 
U43701 


Homo 
sapiens 
Homo sapiens 
Homo sapiens 


DNA-directed RNA polymerase 
I, largest subunit 
ACRP30R2 variant protein, 
ribosomal protein L23a 


8777 
1387 


98 

100 


28 
29 


U02032 
Y41324 


Homo sapiens 
Homo sapiens 


ribosomal protein L23a 
Human secreted protein 
encoded by gene 17 clone 
HNFIY77. 


791 
767 
1083 


100 

97 

99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid 
oxidase HAOX2 


1811 


100 


33 
34 


Z29481 
AB001451 


Homo sapiens 
Homo sapiens 


3-hydroxyanthranilic acid 

dioxygenase 

Sck 


1507 


99 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


2869 
1667 


100 
99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino 
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino 
acid sequence. 


4726 


99 


39 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino 
acid sequence. 


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1 


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1). 


795 


100 


42 


AF282626 


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus 


Elf-1 


2724 


88 


45 


U19617 


Mus musculus 


Elf-1 


2062 


86 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1530 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 




X63547 




oncogene 


S045 


99 


52 


M94043 


Rattus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31783 


Mus musculus 


uridine kinase 


917 


71 


54 


AO J i? / J 


Homo sapiens 


f- >- 73 t~j o /"•■»" i v~iY~ i fsnhny 

LX.wUot.jl ipt. J.UH J. CI t_ U X. 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56 


W74 80b 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 

rtUtfto A 1 - 


1491 


100 


57 


250907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript. 


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


6089 


99 


59 




- . — 
Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4014 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15. 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing l-aal 
domain 1 


1390 


100 


62 


Y66660 


Homo 
sapiens 


Membrane-bound protein 
PDm n i 

trt\\s / oj . 


2492 


99 


63 


Y66660 


Homo 


Membrane-bound protein 


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 






Homo sapiens 


Wr-vmrk RaniAnq T "Jf)R 1 clone 

secreted protein. 


157 


30 


67 


AJ245738 


Homo sapiens 


claudin-15 


1206 


100 


68 


JVWn QQ1 ft 


norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


86 


70 


282059 


Caenorhabdit 
is elegans 


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPAI protein 


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 


Human MSK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1 


1485 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


950 


100 


77 


X99302 


Homo sapiens 


Pop1 


655 


100 


78 


AL136538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
kti12 protein 


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99 


80 


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidyl serine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


2033 


100 


81 


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidyl serine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


1220 


96 


82 


X57351 


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1 


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C 


5959 


S9 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1; 
CLIC4 


1305 


99 


86 


AB018423 


Mus musculus 


SH2 domain-containing protein 


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre-mRNA splicing 
factor~gene_id:MRB17.2 


634 


36 


90 


AJ133721 


Mus musculus 


homeodomain protein 


654 


57 


91 


AJ242864 


Mus musculus 


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO: 86. 


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO: 8. 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


98 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005733 


Homo sapiens 


R33083_1 


1974 


99 


100 


Y95293 


Homo sapiens 


Human GEF containing NEK-like 
kinase substrate sGNK. 


4092 


99 


101 


AL118501 


Homo sapiens 


dJiisiNie.l (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


1509 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1 


2042 


96 


104 


AB015982 


Homo sapiens 


serine/t"hT<*nni v -i r-i ^ o e» i 


4718 


100 


105 


AF151074 


Homo sapiens 


HSPC240 


831 


64 


106 


M35522 


Canis 

familiaris 


GTP-binding protein (rab7) 


354 


50 


107 


R99800 


Homo sapiens 


NTII-l nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1290 


93 


109 
110 


AC005614 
AF064729 


Homo sapiens 
Homo sapiens 


F23269_2 
RAN binding protein 16 


3369 


99 


111 

112 


X52425 


Homo sapiens 


interleukin 4 receptor 


3285 
4496 


100 
100 




Y41686 


Homo 
sapiens 


riunian fkujc / ** protexn 
sequence. 


2285 


100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1. 


1991 


100 


114 


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 
117 


AF189817 
W30891 


Mus musculus 
Homo 


evectin-2 

Human cytostatin III protein. 


1124 
715 


90 

99 






sapiens 








118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120 


AF098070 


Drosophila 
melanogaster 


Lis1 homolog 


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP). 


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu 


935 


36 


127 


Z68220 


Caenorhabdit 
is elegans 


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human zsig44 protein. 


463 


100 


130 


AF115391 


Lactobacillus 
sakei 


ribokinase RbsK 


508 


37 


131 








1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


916 


87 


133 


W52811 


Homo sapiens 


Human DBI/ACBP-like protein 
(DBIH) . 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


3230 


100 


135 


no y lo-l 


Homo sapiens 


non-muscle myosin B 


IRQ 


20 


136 


t.TTil Q a o 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83. 


480 


100 


137 


rv / OtUU 


Homo sapiens 


encoded by gene 75 clone 

rmu/^u ox. 


855 


99 


138 


xVLJV -5 -3 —J ^ W 




cL734 1 (similar to 
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 




2684 93 


Caenorhabdit 

is elegans 


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3F19.18 


647 


31 


148 


W75155 




Human secreted protein 
encoded by gene 41 clone 
HNTME13. 


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP-specific 
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


U10397 


Saccharomyce 
s cerevisiae 


Yhr146wp 


515 


53 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382H0.5.i (novel protein 


2034 


99 








similar to arginyl-tRNA) 






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25726 


Homo sapiens 


Human secreted protein 
encoded from gene 6. 


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32. 


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2). 


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster 
Androgen-dependent Expressed 
Protein LIKE PUTATIVE 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71. 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacte 
rium 

thermoautotr 
ophicum 


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel. 


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176 


W90338 


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related 
molecule HCRM-3. 


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel-related 
molecule HCRM-2. 


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


Clq B-chain precursor 


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183 


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33. 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase 


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1, 
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12 . 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


2605 


99 
194 


AF084259 


Mus musculus 


bromodomain-containing 


693 


54 


195 


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1- 
327) 


994 


61 


196 


Til ft C 'J A n 


Homo sapiens 


protein fhl70 7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj 9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_1. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 

2. 1 


2096 


99 


200 


AB030039 


Homo sapiens 


nf.ftW.Jr JJJ. 


2258 


100 


201 


JC54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


203 


X13885 


Nicotiana 
tabacum 


extensin (AA 1-620) 


185 1 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AU121889 


Homo sapiens 


djJ1076E17,l IKIAA0823 protein 
(continues in ajjO^joujj; 


694 


54 


210 


AF226732 


Homo sapiens 


NPB007 


1345 


76 


211 


X66295 


Mus musculus 


C1q C chain 


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


542 


98 


214 


AO002030 


Homo sapiens 


progesterone binding protein 


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein) 


259 




218 


Y08565 


Homo sapiens 


UDP-GalNAc: polypeptide 
N-acetylgalactosaminyltransferase 


3331 


99 


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AJU035B21 


Arabidopsis 
thaliana 


putative protein 




42 


221 


AL031786 


Schizosaccha 

romyces 

pombe 


putative proline-tRNA 
synthetase 


811 


41 


222 


AL109736 


Schizosaccha 

romyces 

pombe 


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 




136 


23 


224 


7\r m cccq 


Homo sapiens 


" "Atqtqni *i fdiJ97 9Nl lV 


5199 


98 


225 


AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus 


mmDj4 


1988 


92 


227 


X83502 


Saccharomyce 
s cerevisiae 


J1007 


112 


26 


228 


X83502 


Saccharomyce 
s cerevisiae 


J1007 


79 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein 
PRO828. 


982 


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 
234 


W00365 
Y53762 


Homo sapiens 
Homo sapiens 


Human cyclin B1. 

A GTP-binding polypeptide 


2218 
1017 


99 
100 


235 


Z50749 


Homo sapiens 


designated RAQ. 
yeast sds22 homolog 


236 
237 


Z50749 
AB026491 


Homo sapiens 
Homo sapiens 


yeast sds22 homolog 
PICK1 


1800 
1754 
2137 


100 

98 

100 


238 


AJ270205 


Entodinium 
caudatum 


putative 

phosphatidylinositol-4- 
phosphate 5-kinase 


114 


37 


239 


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP). 


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP). 


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


996 


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


1005 


100 


244 


AL031320 


Homo sapiens 


dJ20N2.1 (novel protein 
similar to yeast and 
bacterial cytosine 
deaminase) 


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


2391 


98 


247 
248 


U32274 


Saccharomyce 
s cerevisiae 


Ydr386wp; CAI: 0.12 


191 


37 


249 


Y41719 
AB029434 


Homo 
sapiens 
Homo sapiens 


Human PRO564 protein 
sequence. 
ghrelin precursor 


1079 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acyl carnitine 
carrier protein 


611 
246 


100 
38 


251 


W80993 


Homo 
sapiens 


Human RIP-interacting factor 
RIF. 


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 


253 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM49). 


765 


100 


254 
255 


AL354533 
AF233322 


Leishmania 
major 

Mus musculus 


possible adenylate kinase 
zinc transporter like 2 


265 


34 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1. 


1936 
2247 


95 
99 


257 


AL035539 


Arabidopsis 
thaliana 


putative amino acid transport 
protein 


390 


27 


258 
259 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61. 


1171 


100 




AD03S689 


Homo sapiens 


dai8 7Jll.l (novel protein 
similar to protein kinase C 
inhibitors) 


974 


100 


261 
262 


AE000909 

AL050131 
AF019661 


Methanobacte 
rium 

thermoautotr 
ophicum 
Homo sapiens 
Mus musculus 


serine/threonine protein 
kinase related protein 

hypothetical protein 


363 
626 


30 

100 


263 
264 

265 


AL035553 
AL022318 

AF205940 


Homo sapiens 
Homo sapiens 

Homo sapiens 


zeta proteasome chain; PSMA5 
cuJ3io»J6.1 (novel protein) 
bK150C2.3 (PUTATIVE novel 
protein similar to APOBECl) 
endomucin 


1214 

821 

1072 

1289 


16b 

100 
100 

100 


266 
267 


AL023583 
AL034548 


Homo sapiens 
Homo sapiens 


da50OJL14.1 (novel protein) 
dJ1103G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


789 
1888 


100 
99 


268 


AF161470 


Homo sapiens 


HSPC121 


1884 


98 


269 


AF161470 


Homo sapiens 


HSPC121 


1232 


96 


270 


X90763 


Homo 
sapiens 


HHaS hair keratin type I 
intermediate filament 


2190 


99 


271 


AF207600 


Homo sapiens 


ethanolamine kinase 


1952 


100 


272 


M32334 


Homo sapiens 


intercellular adhesion 
molecule 2 


1436 


100 


273 


AF161483 


Homo sapiens 


HSPC134 


663 


61 


274 


Y53052 


Homo sapiens 


Human secreted protein clone 
df202_3 protein sequence SEQ 
ID NO: 110. 


587 


100 


276 


Y77576 


Homo sapiens 


Human cytoskeletal protein 
(HCYT) (clone 2195418) . 


762 


100 


277 


AF077042 


Homo sapiens 


30s ribosomal protein S7 
homolog 


1269 


100 


278 


Y94907 


Homo sapiens 


Human secreted protein clone 
ca!06_l9x protein sequence 
SEQ ID NO: 20. 


1619 


98 


279 


Y68788 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99 


280 


Z75134 


Canis 

familiaris 


rod transducin 


1816 


100 


281 


Z75134 


Canis 

familiaris 


rod transducin 


1718 


96 


282 


AF249873 


Homo sapiens 


muscle-specific protein 


1395 


100 


283 


AL050007 


Homo sapiens 


hypothetical protein 


405 


98 


284 


AF201931 


Homo sapiens 


DCl 


1859 


99 


285 


AF156102 


Homo sapiens 


ELL complex EAP30 subunit 


1318 


99 


286 


Y35897 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
146. 


1250 


99 


287 


U88964 


Homo sapiens 


HEM45 


923 


100 


288 


AL050143 


Homo sapiens 


hypothetical protein 


598 


100 


289 


AJ0110S8 


Homo sapiens 


telethonin 


574 


100 


290 


Y66724 


Homo 
sapiens 


Membrane-bound protein 
PRO836. 


2321 


100 


291 


AF034801 


Homo sapiens 


liprin-alpha4 


2565 


98 


292 


AF034001 


Homo sapiens 


liprin-alpha4 


2590 


100 


293 


AL049851 


Homo sapiens 


dJ889J22B.1 (novel protein 
(isoform 1)) 


1738 


100 


294 


Y73348 


Homo sapiens 


HTRM clone 839651 protein 
sequence. 


1245 


99 


295 


L11672 


Homo sapiens 


zinc finger protein 


1694 


44 


296 


AU)3S423 


Homo sapiens 


dJ20I3.1 (brain mitochondrial 
carrier protein-1 (BMCP1)) 


1024 


79 


297 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


298 


AF161417 


Homo sapiens 


HSPC299 


1147 


85 


299 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


1236 


99 


300 


U26397 


Rattus 
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


301 


AF036145 


Homo sapiens 


meningioma-expressed antigen 
5 


3458 


100 


302 


Z82022 


Homo sapiens 


GlcNAc-1-P transferase 


2067 


99 


303 


AF269232 


Mus musculus 


butyrophilin-like protein 
BUTR-1 


271 


50 


304 


AJ222644 


Arabidopsis 
thaliana 


asparaginyl-tRNA synthetase 


659 


50 


305 


AF054180 


Homo 
sapiens 


hematopoietic cell derived 
zinc finger protein 


351 


79 


306 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


3056 


100 


308 


Y44486 


Homo 
sapiens 


Human GPRW receptor 
polypeptide. 


1721 


100 


309 


AJ131891 


Homo sapiens 


DNA polymerase mu 


2598 


100 


310 


AF293335 


Homo sapiens 


p30 DBC 


1248 


92 


311 


AF17652S 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959 


81 


313 


Z36715 


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a 


3046 




316 


Y66666 


Homo 
sapiens 


Membrane-bound protein 
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1. 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 




319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


yuLdLive J. ri J. proceill 


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID NO: 66. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271. 


1976 


100 


325 


Y94944 


Homo sapiens 


Human secreted protein clone 
bf157_16 protein sequence 
SEQ ID NO: 94. 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence. 


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z78013 


Caenorhabdit 
is elegans 


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo 
sapiens] 
>R65207 
R65207 02- 
MAR - 1 9 9 5 27- 
AUG- 1993 

Human 

stromalin~l. 
[Homo 
sapiens 


nuclear protein SA-1 


6492 


99 


331 


AL008583 


Homo sapiens 


dJ327ai6.3 (supported by 
GENSCAN, FGENES and GENEWISE) 


2133 


99 


332 


Y36104 




Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3 


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous to the 
Eimeria tenella gene et100 


154 


26 


336 


Y85564 




Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


337 


Y85564 




Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


Z66561 


Caenorhabdit 
is elegans 


Similarity to Human rabi3 
protein (PIR Acc. No. 
A49647). 


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


Ij291S4 


Homo sapiens 


immunoglobulin heavy chain 


439 


84 
VDCT region 






344 


U10281 


Sus scrofa 


gastric mucin 


279 


24 


345 


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


84 


347 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


2363 


91 


348 


AL049481 


Arabidopsis 
thaliana 


AIG1-like protein 


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


99 


351 


AK024477 


Homo sapiens 


FLJ00070 protein 


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302 


2623 


97 


355 


AJ010014


Homo sapiens 


M96A protein 


1269 


47 


356 




Homo sapiens 




941


91 


357 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


1911 


100 


358 


W78128


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 

UnCRT 
IIL/J D ± _J o . 


1117 


100


359


AUi4 Aft 


— . . . ■« 

Drosophila


r^z. p<jj.ypept- -Lcit; 


316 


45 


360 


AF151079 


Homo sapiens 


HSPC245 


543 


100 


361


YS3 8 86 


Homo sapiens 


A suppressor of cytokine
signalling protein
designated HSCOP-6.


Kin 


41


362




Drosophila melanogaster




681 


46 


363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364




Homo sapiens


proSAAS 


1319 


100


365


At loX?Oi 


Homo sapiens 


proSAAS


1024 


99 


366 


U73200 


Mus musculus


p116Rip


884


82 


367 


AF263744


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


99 


368 


U37501 


Mus musculus


laminin alpha 5 chain 




72 


369 


AF043695 


Caenorhabditis elegans


similar to the protein
phosphatase 2c family


549 


36 


370 


Y73440


Homo sapiens 


Human secreted protein clone 
yj23_1 protein sequence SEQ
ID NO: 102. 


~i a. pa 




371




Homo sapiens 


misato 


2869


97 


372




Homo sapiens 


epju-ueiidi piuucm iuk *_ in 


3927


100 


373 


Y73345 


Homo sapiens


HTRM clone 438283 protein
sequence.


273 


80 


374 


AF169017


Homo sapiens


formiminotransferase
cyclodeaminase


2717 


98 


375 


A95106 


unidentified


RED ALPHA 


1202 


99 


376 


W74828 


Homo sapiens


Human secreted protein 
encoded by gene 100 clone 
HLQA352 . 


1012 


99 


377 


Y32131 


Homo sapiens


Human LYST-2 protein.


3556 


99 


378 


M14912 


Homo sapiens 


pol


132 


66 


379 


AF090934 


Homo sapiens 


PRO0518 


382 


100 


380 


X66363 


Homo sapiens 


serine/threonine protein
kinase 


2499 


100 


381 


Y41699 


Homo 
sapiens 


Human PRO703 protein 
sequence . 


2362 


100 


382 


AF174498


Homo sapiens 


GR AF-l specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis elegans


coded for by C. elegans cDNA
yk173c12.5


246 


36 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238520 


Homo sapiens 


putative transcription 
factor-like nuclear regulator 


4123 


97


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


387 


AF208845 


Homo sapiens 


BM-003 


1375 


99 


389 


XS7B21 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


1616 


62 


395 


AF181721 


Homo sapiens 


RU2S 


one/ 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human beta IV-spectrin
protein. 


1626 


98 


397 


U48238


Mus musculus 


ZinC fincier nrnh #=» t r> n<»»i 


749 


60 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 


5337 


60 


400 


AL022599 


Schizosaccharomyces pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase similar to 
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus




10246


62 


403 


AL133288 


Homo sapiens


a.ob/iuf.± isimiiar to 
D. melanogaster CG5986 


761 


100 


404 


Z68753 


Caenorhabditis elegans


"V-J-I.O .JO 


888 


48 


405 


Z78013 


Caenorhabditis elegans


oiuiiidtiLy cq' urosopiuia 
wouiiciui *. cia, lcu tumor 
suppressor 


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


1168 


100 


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


* yj.sJK.tsA.il rxJA.XJ 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 
414 


AL031658
X57398 


Homo sapiens 
Homo sapiens 


dJ310O13.7 (novel protein 
similar to H my-e* t- -? vjt> opt _ 
3) 

pm5 protein 


776 


98 


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


6131 


99 
99 


416 


U43503 


Saccharomyces cerevisiae


Lphlp 


115 


42 


417 
418 


AL160493 
Y08100 


Leishmania 
major 

Homo sapiens 


possible t26fl7.2l 
Human PRO331 protein.


239 
33 q . 


- ■- 

29 


419 
420 


U15131
AF117946 


Homo sapiens
Homo sapiens 


p126

Link guanine nucleotide 
exchange factor II 


2228 
2363 


54 
100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530 


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a 1 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor


1084 


55


427




Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 


AE003683 


Drosophila 
melanogaster


CG8312 gene product 


149 


29 


429 


Y07829 


Homo sapiens 


RING finger protein 


2201 


99 


430 




Drosophila
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


Ar 0^3674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division 
control protein 


2284 


100 


434 


AB006697


Arabidopsis
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100 


438 


AB040672 


Homo sapiens 


UDP-GalNAc: polypeptide N-
acetylgalactosaminyltransferase


1075 


63 


439


AF105228 


Bos taurus 


tuftelin 


285 


33 


440


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553).


3073 


99 


441 


X14971 


Mus musculus 


alpha-adaptin (A) (AA 1-977)


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1- 
938) 


3979 


81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein
PROH3 6. 


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


piL 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin 


2662 


85 


447 


AF132484 


Mus musculus 


unknown 


478


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 


Z68753 


Caenorhabditis elegans


ZC518.3b


951 


49 


451


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3. 


155 


32 


452 


W85727 


Homo 
sapiens 


Novel protein (Clone 
BM46_10) . 


2799 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C.elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005 


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61_1 protein sequence SEQ
ID NO: 156.


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene 18 clone


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446 


Homo sapiens 


Similar to a C.elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens 


r F2 5 96 51 


1018 


100 


464 


AF064856 


Rattus sp. 


7acomp protein 


1845 


84 


465 


AF223408


Homo sapiens 


B99 


3686 


99


466 


AF223408


Homo sapiens 


B99 


2878 


87 


467 


AF104415 


Mus musculus 


gene trap locus-13


6336 




468 


U53450


Rattus 
norvegicus 


Jun dimerization protein 1 
JDP-1


196 


49 


469 


AL031297 


Homo sapiens 


dJ97P20.1 (novel gene)


3564 


99


470 


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2B 
subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein 


284 


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


474


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX39.


838 


100


475


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52.


3411 


100


476 


D38549 


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2




100 


478 


AL031534 


Schizosaccharomyces pombe






40


479


L28125 


Podospora 
anserina 


beta transducin-like protein 


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77 


481 


AJ238248


Homo sapiens 


centaurin beta2 


3986 


99 


482 


Z38061 


Saccharomyces cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 


HSPC263 


1404 




484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527 


Homo sapiens 


alpha 1 (VIII) collagen 


4166 


99 


487 


Y19062 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein
sequence.


555 


56 


489 


AL021918 


Homo 
sapiens 


Jt>3 4 1 S 1 / iCr~l 1 rpl a-oH »7 _ 

Finger protein 184 ) 


4184 


100 


490 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-
938) 


acne 
O /D 


97


491


U52426 


Homo sapiens


GOK 


1459 


59 


492 


AL359773 


Leishmania
major 


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportinl 




100 


494


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein
with some similarity to
Drosophila KRAKEN)


513 


96 


495 


AF036977


Homo sapiens 


unknown 


1812 


100 


496 


U93564 


Homo sapiens 


p40 


133 


45 


497 


Y91405


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126. 


357 


100 


498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein


653


43 


499 


Y16601 


Homo sapiens 


Human cell-cycle
phosphoprotein CECYP-2. 


1658 


98 


500 


X70944 


Homo sapiens


PTB-associated splicing 
factor 


3883 


100 


501 


AF027503 


Mus 

musculus 


putative membrane-associated
guanylate kinase 1 


205 


36 


502 


AF282874 


Homo sapiens 


nectin 3; PRR3


2856 


99 


503 


AJ249732 


Homo sapiens 


G8 protein


669 


100 


504 


AF208861 


Homo sapiens 


BM-019 


1629 


100 


505 
507 


L09708 
X66285 


Homo sapiens 
Mus musculus


complement component C2 
HC1 ORF 


4022 


100 


508 


D00189 


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit


115 
5227 


43 
99


509 


Y94971 


Homo sapiens 


Human secreted protein clone
fa171_1 protein sequence SEQ
ID NO: 148.


2176 


100 


510 


AB019038


Homo sapiens 


beta-1,4 mannosyltransferase


781 


77 


511 


/IDVJL _y v -J O 


Homo sapiens


WCLw JL f ** lUallUUoy J. Ltcuia LcI dbC 


1347


100 


512 


ZXROI Qni ft 






1520


QQ 


513 


X84908 


Homo sapiens 


phosphorylase kinase 


5729 


99 


514 


X52851 


Homo sapiens 


peptidylprolyl isomerase 


650 


76 


515 


At lobU04 


Homo 
sapiens 


epidermal growth factor 
repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7683. 


505 


99 


517


U04 /Ob 


Bos taurus


50 kDa protein 


1749


77 


518 


G00G53 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734.


530 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y99366 


Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88. 


3394 


97 


521 


AF266852 


Homo sapiens 


PTPLA 


1295 


100 


522 


AE000995 


Archaeoglobu 
s fulgidus 


chromosome segregation
protein (smc1)


153 


20 


523 


AF062249


Homo sapiens 


immunoglobulin heavy chain 
variable region 


605 


97 


524 


AJ223830


Rattus 
norvegicus 


ARE1 


2950 


98 


525 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


1276 


83 


526 


AF145658 


Drosophila 
melanogaster 


BcDNA.GH10229


320 


33 


527 


AF112213 


Homo sapiens 


putative Rab5-interacting
protein 


524 


79 


528


D49387 


Homo 
sapiens 


NADP dependent leukotriene b4 
12-hydroxydehydrogenase


1616 


100 


529 


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9.


328 


32 


530 


AL079335 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein 
(DKFZP564A032, SBBI88) 
similar to mouse IFN-gamma 
induce MG11. ) 


1059 


95 


531 


Y91506 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179.


1159 


98 


532 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2) 


576 


50 


533 


X76116 


Caenorhabditis elegans


carrier protein (c2) 


506 


50 


534 


X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase
propeptide (424 AA)


1972 


100 


535 


Y09267 


Homo sapiens 


flavin-containing 
monooxygenase 2 


2486 


100 


536 


Z11773 


Homo sapiens


SRE-ZBP 


2201 


99 


537 


D84224 


Homo sapiens 


methionyl tRNA synthetase


4741 


99 


538 


D84224 


Homo sapiens


methionyl tRNA synthetase 


3887 


99 


539 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


2933 


96 


540 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 


541 




Bos taurus


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848 


77 


542 


Y92514 


Homo sapiens 


Human OXRE-li. 


2301 


99 


543 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2151 


61 


544 


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 


207 


38 


545 


A06669 


synthetic 
construct 


preTGF-beta1


2070 



99


546 
547 


Y02698 
AF112205 


Homo sapiens 
Homo sapiens 


Human secreted protein 
encoded by gene 49 clone
HTPCS60. 
WSB-1 protein


854 
2275 


98 
100 


548 
549 


X60271 
AC016827


Mus musculus 
Arabidopsis
thaliana 


c-rel 

putative GTPase 


2264 
810 


74 
42 


550


Y70400 


Homo 

sapiens 


Human cell-signalling
protein-2.


429 


68 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4. 


1112 


95 


553 


AF119855 


Homo sapiens 


PRO1847


265 


67 


554 


M17236


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 


555 




Arabidopsis
thaliana 


putative protein 


540 


40 


556 
557 


AC006963 
AK024487 


Homo sapiens 
Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844)
FLJ00086 protein


515 
1623 


44
98


558 
559 

560 


M12140 
W74825

X56581 


Homo sapiens 
Homo sapiens 

Homo sapiens 


pol gene protein; Xxx 
Human secreted protein 
encoded by gene 97 clone 
HAQBF73 . 
junD protein 


J. -L / 

225 


48 
56 


561


AF003136


Caenorhabditis elegans


an AMP-binding motif 


373 
2926 


88 
54 


562 


AL109839


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1
(poly (A) -binding protein) 


877 


100 


563 


AF131640 


Drosophila 
melanogaster


BcDNA.GH09817 


289 


42 


564 

565 
566 


AF052723 

AF161472 
Y28817 


Feline 

leukemia 

virus 

Homo sapiens 
Homo sapiens 


gag-pol precursor polyprotein 
gPr80 

HSPC123 

pt326_4 secreted protein. 


1547 

439 
3338 


43 

44 

100


567 
569
570 
571 
572 


U09848 

AF155113 

AF155113 

AL032821 

M69181 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


zinc finger protein 
NY-REN-55 antigen
NY-REN-55 antigen
dJ55C23.1 (vanin 1)
non-muscle myosin B 


1738 
3603 
3951 
1821 


100 
93 
99 
98


573
574 


M69181 
Y59678 


Homo sapiens 
Homo sapiens 


non-muscle myosin B

Secreted protein 108-008-5-0-
E6-FL.


7350 
7311 
772 


99 
98 
100 


575 


AL365234 


Arabidopsis
thaliana 


putative protein 


788 


40


576 


AL365234


Arabidopsis 
thaliana 


putative protein 


788 




577 
578 


X06745 
AB041642 


Homo sapiens 
Homo sapiens


DNA polymerase alpha-subunit
(AA 1-1462)
PAR-6


7619 


99 


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


1342 
2446 


100 
100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 


2339 


99 


582 


082319 


Homo sapiens 


novel ORF 


342 


100 


583 
584 


P92219 
AJ223948 


Homo sapiens 
(human) 
Homo sapiens 


CR1 protein
RNA helicase 


11425 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


6608 
3874 


99 
99 


586 
587 


Y42384 
AF129756 


Homo 
sapiens 
Homo sapiens 


Amino acid sequence of 

Iv3l0 7. 

BAT4 


1007 
1873 


37 
98


588 


AF131775 


Homo sapiens 


Unknown 


1929 


99 


589 


AJ250865 


Homo sapiens 


TESS 2 


2348 


100 


591 


Z98885 


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802


Homo sapiens 


dJ798A10.1 (novel protein)


212 


55 


596 


AL022329 


Homo 
sapiens 


DK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


Homo 
sapiens] 
>Y49635 
Y49635 21- 
OCT-1999 15- 
APR-1998 
Human sdp3.5 
protein . 
[Homo 
sapiens 


putative cell cycle control 
protein 


335 


23 


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 


1574 


99 


600 


L36531


Homo sapiens 


integrin alpha 8 subunit 


5386 


99 


601 


Y38458 


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584 


Homo sapiens 


GGA1 


3265 


100 


603 


Y13115 


Homo sapiens 


serine/threonine protein
kinase 


5071 


99 


604 


AL132776 


Homo sapiens 


dJ393D12.1 (KIAA0776) 


2413 


99 


605


AL034452


Homo sapiens 


dJ6B2J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L 


2603 


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein


3069 


100 


610 


AF163572 


Homo sapiens 


Forssman glycolipid
synthetase 


1865 


99 


611 


AF161503 


Homo sapiens


HSPC154 


1261 


97 


612 


L41834 


Ensis minor


nuclear protein 


345 


30 


613 


Y91954 


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9) . 


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027) 


361 


94 


615 


X85786 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


3609 


97 


618 


U28789 


Mus musculus 


PACT 


5936


89 


619 


Y35914 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
163 . 


1684 


99 


620 


A3046382 


Mus musculus 


testis-abundant finger
protein 


199 


23 


621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


99 


622 


AF068286 


Homo sapiens 


HDCMD38P 


861 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968 


Homo sapiens 


RII-alpha subunit (AA 1-404)


2079 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100


629 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein. 


1694 


100


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid
dehydrogenase type VII 


1754


100


631 


AL034555


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67) ) 


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


223£ 


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 


X66357 


Homo sapiens 


serine/threonine protein 
kinase 


1589 


100 


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637 


AR004884 


Homo sapiens 


PKU-alpha 


3718 


99 


638 


AJ002303 


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b


1002 


100 


640 


AJ002303


Homo sapiens 


synaptogyrin 1c


933 




641 


D87682 


Homo sapiens 


similar to a C.elegans 
protein encoded in cosmid 
T26A5 . 


2676 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643


X06661 


Homo sapiens 


calbindin (AA 1-261) 




100 


644 


AF119900 


Homo sapiens 


PRO2822


185 


76 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein


10110 


99 


648 


U67934 


Homo sapiens 


homolog


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


rv a. ihvj xxngtsx" oinamg protein 


3330 


91 


650 


AL034553 


Homo sapiens 


uu jxMrzu . z i iv t aq protein 

activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388


99 


654 


AC004614


Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 


655 


Y57908 


Homo sapiens 


Human transmembrane protein 
HTMPN-32. 


608 


99 


656 


Z34975 


Homo sapiens 


ldlCp 


3733 


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein)


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus


dysferlin 


4752 


59 


663 


AF182316 


Homo sapiens


myoferlin 


6232 


99 


665


AL161516 


Arabidopsis
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y13355


Homo sapiens 


Amino acid sequence of 
protein PRO220.


3692 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 

beta-N-acetylglucosaminidase 

gene 


611 


52 


671 


X56123 


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4


4053 


99 


675 


1*14463 i 


Rattus 


transducin


3619 


92


676



677



678



SEQ 
ID 
NO: 



679 



680 



681 



682 



683



684 



685 



687 



TABLE 2 



ACCESSION 
NUMBER 



AC005757 



S61069 



AF271388 



X79066 



AF118566



Y51415 



AX.133S45 



Y86214 



SPECIES 



norvegicus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



SMITH- 
WATERMAN 
SCORE 



R32611 1 



100 



reverse transcriptase 
homolog pol (retroviral
element)



252 



65 



acetyl neuraminic acid 
synthase 



2273 



Mus musculus 



Homo 
sapiens 



Homo sapiens 



Homo sapiens 



Y94952 



Homo sapiens 



AL021878 



AE000198 



M58378 



Homo sapiens 



Escherichia 
coli 



ERP-1 



hematopoietic zinc finger 
protein 



1783 
769 



Human wild type pKe83 
protein.



2621



bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 



700 



Nuclear transport protein 
clone hfb341 protein 
sequence.



5888



Human secreted protein clone 
fhll6__ll protein sequence 
SEQ ID NO: 110. 



354 



dJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2) ) 



154 



orf, hypothetical protein



628



Homo sapiens 



synapsin I 



3730 
508 



IDENTITY 



100 



100 



50 



99 



68



98 



67 



100 



9S 



688 



689 



690 



691 



692 



AF039697 



Homo sapiens 



antigen NY-CO-31



U09355 



Oryctolagus 
cuniculus 



protein phosphatase 2A1 B 
gamma subunit



2356 



99 



AF155106 



Homo sapiens 



NY-REN-36 antigen



265 



50 



AC004774 



Homo sapiens 



Dlx-5 



1542 



100 



X90530 



Homo sapiens 



ragB 



1926 
1405 



99 
99



694 



695 



696 



698 



X90530 



Homo sapiens 



ragB 



X90530 



Homo sapiens 



ragB



1590 



G01563 



Homo sapiens 



Human secreted protein, 
ID NO: 5644. 



SEQ 



330 



100 



AC011810 



Arabidopsis 
thaliana



Putative methionine 
aminopeptidase



669 



52 



AJ250425 



Rattus 
norvegicus 



Collybistin I



2455 



98 



AB037901 



Homo 
sapiens 



gene amplified in squamous 
cell carcinoma-1



5364 



99 



699



701 



Y99401 



Homo sapiens 



Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218.



AF221712 



Homo 
sapiens 



Smad- and Olf- interacting 
zinc finger protein 



6705 



100 



702 



703 



704



705



706 



707 



X83573 



Homo sapiens 



ARSE 



AJ243274 



Homo sapiens 



AP-2rep protein 



3184 
2078 



Y71262 



Homo sapiens 



Y71262 



Homo sapiens 



Y41257 



Homo sapiens 



Human chondromodulin-like
protein, Zchm1.



1697



Human chondromodulin-like
protein, Zchm1.



99



94 



1736 



99 



Amino acid sequence of long 
human FAIM. 



1060 



100 



AL022237 



Homo sapiens 



bK1191B2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 



2030 



100 



708 



709 



710 



711 



AJ006266 



Homo sapiens 



G01571 



Homo sapiens 



AND-1 protein 

Human secreted protein. 



SEQ 



777 



99 



ID NO: 5652. 



Y08698 



Homo sapiens 



ranbp3 



2849 



Y68770 



Homo sapiens 



Amino acid sequence of a
human phosphorylation
effector PHSP-2.



754 



93 






WO 01/53312 



TABLE 2 



PCT/US00/34263





712 


U93574 


Homo sapiens 


putative pl50 


799 




713 


AC004531


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013 




phosphoribosyl transferase)


862 


100 


717 


AB035123 


Mus musculus


a xyiia/ \j x id axpna/vjyi.o 
alpha synthase 


1696 


93 


718 


Y96290 


Homo >P40254 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
I Homo 
sapiens 


Human IGFAM— 0 iirannnrvrl nhnl ■» t» 




B5 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565


Homo 
sapiens] 
>W41564 
W41564 08- 
OCT-1997 05- 

ADD 1 Q a ~£ 

Human 

calpain. 

[Homo 


Human calpain. 


1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100


725 


AC006708 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726 


AC006708


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988 


46 


727 


AC024818 


Caenorhabditis elegans


contains similarity to Pfam
family PF00400 (WD domain,
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728 


AJ005897


Homo sapiens 


JMS 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012.


578 


100


731 


AB012720 


Oncorhynchus masou


GTP-binding protein


3 865 


fO 


732 


W73404 


Homo sapiens 


encoded by Gene No. 8.


862 


97 


733 


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813


Caenorhabdi t 
is elegans 


Hypothetical protein 
Y54F10AL.a


152 


24 


735 


AL035461


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyltransferase
family member protein) 


1562 


98 


736 


U00033


Caenorhabdi t 
is elegans 


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 






WO 01/53312 



PCT/US00/34263



TABLE 2 



SEQ 
ID 
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793 


100 


739 


AJ133115 


Homo sapiens




2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


953 


100 


741 


X98258 


Homo sapiens 




564 


74 


742 


U97191 


Caenorhabditis elegans


strong similarity to the YPT1
sub-family of RAS proteins


960 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase


2191 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ

xlLi.lt Idi l yiULtiH) 


496 


98 


745 


X97064 


Homo sapiens 


Sec23 protein L 


4034 


99 


746 


W93946 


Homo sapiens 


HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


1565 


99 


748 


M19529 


Sus scrofa


follistatin A


1906 


98 


749 


ACT249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


28 


750 


AC004410 


Homo sapiens 


EOS39554 1 


2094 


100


751


AF074968 


Homo sapiens 


p47ING3 protein


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Sp1


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine
phosphohistidine inorganic
pyrophosphate phosphatase


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39


160


77


755 


AB008430 


Homo sapiens 


CDEP 


142 


29 


758 


L32162


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


- ->QC 




760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein- 13. 


625 


100 


761 


AF218586 


Homo sapiens


Cide-b 


1136


100 


762 


U38934 


Gallus 
gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626 




765 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid 
C27F2 (U40419) 


568 


38 


766 


AL023828 


Caenorhabditis elegans


Y17G7B.14




27 


767 


Y82777 


Homo sapiens 


Human chordin related protein
(Clone dw665_4).


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (caBP- s ) . 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521)


2641 


97 


771 


AOF006591 


Homo sapiens


cysteine-rjcn jj i. w 


1793 


100 


772 


A08695 


Homo sapiens


rap2 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


Y919S0 


Homo sapiens 


Human cytoskeleton associated 
protexn j \\».ioivir -j / • 


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


777 


AL023799


Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56 


778 


G01880


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961. 


849


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582


Homo sapiens 


dJ130E4.2 (KIAA0796) 


1321 


68 


781 


Z75955 


Caenorhabditis elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain- 
containing 1 protein) 


900 


100 


783 


AF061262 


Mus musculus


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ


649 


95 



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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH-
WATERMAN
SCORE


% 

IDENTITY 








ID NO: 7954.






785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein.


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rab protein, RABP-1,
protein sequence.


1048


99 


787 


Z97029


Homo sapiens 


ribonuclease HI large subunit 


1548


99 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


S4t 


789 


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638 


bacteriophage lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Drotein- 7sectu^nrp 


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabditis elegans


strong similarity to thw 
A J / * a* tarnxxy or Kinases 


227 


28 


799 


AF112201 


Homo sapiens 


neuronal protein NP25 


1053 


100 


800 


AF234765


Rattus
norvegicus


serine-arginine-rich splicing
regulatory protein SRRP86 


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like 
protein 


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097


Caenorhabditis elegans


Similarity to Human 
retinoblastoma-binding

rjmhdiin DQAD/C <■ r\r- C C -v J3 i •*» c 

5 rri f^* a f rrim ^ > « Art a 
v-umca ix. utii luis gene 


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 6194.


496


98


805 


AL121673


Homo sapiens 


bA305P22.1 (novel protein)


1160 


100


806 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 

\f *"* ^- V— JL 11 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 


264 


30


808 


AB013885 


Homo sapiens 


beta-ureidopropionase 


1494


100 


809 


AF078842 


Homo sapiens 


HOT/Tli protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303 


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812


Z74029 


Caenorhabditis elegans


Similarity to c. elegans 
f r om this g ene 


610 


71 


813 


Z73497 


Homo sapiens 


dJ240C2.2 (Core histone
H2A/H2B/H3/H4)


324


100 


814


K87689 


Homo 
sapiens 


Human HTXFTi 9 polypeptide. 


1464 


99 


815


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon)


1109 


99 


816 


Z92539 


Mycobacterium tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL117S5S 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660J2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


K34807 


Musca domestica


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531


Schizosaccha


caffeine-induced death


184 


29 



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TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%

IDENTITY






romyces 
pombe 


protein 1 






825 


AJ006692


Homo sapiens 


ultra high sulfer keratin 


693 


68 


826 


U23037 


Oryctolagus cuniculus


eIF-2Bepsilon


3406 


on 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493.


464 


100 


828 


Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17.


113 


44 


829 


Y32199


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33.


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9 


2097 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720. 


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1448 


94 


837 


X12517 


Homo sapiens 


C protein (AA 1-159)


918 


100 


838 


U32865 


Drosophila 
melanogaster 


linotte protein 


164 


24 


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2


631 


56 


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase


2840 


98


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase 


1796 


100 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390. 


278 


98


843 


AE003615 


Drosophila
melanogaster 


ade3 gene product 


113 


48 


844 


G01350


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


629 


100 


845 


U27838 


Mus musculus


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


3305 


96 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


AF164794


Homo sapiens 


Diff33 protein homolog 


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784 


Homo sapiens 


makorin 1 


2062 


97 


851


Y58628 


Homo sapiens


Protein regulating gene 
expression PRGE-21. 


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100 


854 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


330 


96 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


203 


100 


856 


AF285118 


Homo sapiens 


CGI-203 


452 


100


857 


AC006069


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


1383 


3D 


858 


AJ_iU2154b 


Homo sapiens 


cytochrome c oxidase
polypeptide VIa-liver
precursor (EC 1.9.3.1)


593 


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1 


616 


100 


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabditis elegans


mitochondrial carrier protein 


370 


43 


864 


AF154108 


Homo sapiens 


tumor necrosis factor type i 


3559 


99



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TABLE 2 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


865 


AE001530 


Helicobacter 
pylori J99 


receptor associated protein 
putative 


230 


32 


866 
867 


X57807 


Homo sapiens 


immunoglobulin lambda light 
chain 


699 


91


868 


AMJ31673 
Y11652 


Homo sapiens 
Homo sapiens 


dJ694B14.1 (PUTATIVE novel
KRAB box protein with 18 C2H2
zinc finger domains)
phosphate cyclase


4066 


99 


869

870 
871 


AF192968 

AB020648 
AL031427 


Homo sapiens 

Homo sapiens 
Homo sapiens 


high-glucose-regulated
protein 8 
KIAA0841 protein 
dJ167A19.1 (novel protein)


238 
3041 

3237 


100
99 

59 


872 
873 

874


AF151534 
AL021331 

X14608 


Homo sapiens 
Homo sapiens 

Homo sapiens 


core histone macroH2A2.2
dJ366N23.1 (putative C. 
elegans UNC-93 (protein 1, 
C46F11.1) LIKE protein) 


1608 
1866 
1129 


100
100
100


875 
876 


ALU 733 4 


Homo sapiens 


propionyl-CoA carboxylase 
dJ687F11.1 (novel protein
(part of translation of cDNA 
DKFZp434N061, Em:AL110249) ) 


3579 
306 


100 
100 


877 


X79489 


Saccharomyces cerevisiae


E-925 protein 


446 


35


878 
879 


Y53001 
AF231064


Homo sapiens 
Homo sapiens 


Human secreted protein clone 
dn834 1 protein sequence SEQ 
ID NO: 8. 

"CKMPl.5 


811 
957 


100 

100


880 
881 


X79417 
AF001317 


Sus scrofa
Saccharomyces cerevisiae


40S ribosomal protein S12
Soi1p


687 
478 


100

28 


882 


Y87275 
M14036 


Homo sapiens 
Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52.
C1-inhibitor


2547 


100 


883 

884 
885


AB041261 
AF020313 


Homo sapiens 
Mus musculus


calcium-independent
phospholipase A2
proline-rich protein 48


598
2903

999 


77
100

84


886 
887 


Y10936 
AF073997 


Homo sapiens 
Mus musculus


hypothetical protein 

myotubularin related protein 

1 


1104 
866 


99 

36


888 


Y57893 
AL117635 


Homo sapiens 
Homo sapiens 


Human transmembrane protein 
HTMPN-17. 

hypothetical protein


1099 


94 


889
890 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT 9 


929 
2046 


99
99 


891 


Y36031


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO 
416. 


583 


100 


892 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO 
416. 


192 


57 


893 


AF237631 
AF090929 


Homo sapiens 
Homo sapiens 


ubiquitous tropomodulin U- 
Tmod 

PRO0477p


1798 


100


894 
895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1) 


653 

1 1 QC 
•J 170 


99 
100 


896 


AL031228
AF171102


Homo sapiens

Homo sapiens 


dJ1033B10.2 (WD40 protein 

BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 
retinal degeneration B beta 


2825 


96 


897 


AE003551


Drosophila
melanogaster


CG18176 gene product


1302 
S3 3 


95 
13 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


898 


AJ237946


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899 


Z97184 


Homo sapiens 


HKE2 


624 


100 


900 


Z97184 


Homo sapiens 


HKE2


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger


1942 


100 


902 


AF091034


Homo sapiens 


GTP-binding protein RAB22A


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414


96 


904 


L04733


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product 


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W84085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952.


521 


100 


912 


G03162 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7243. 


387 


87 


913 


AJ243721 


Homo 
sapiens] 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . tHomo 
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1710 


100 


914 


U24189 


Caenorhabditis elegans


hypothetical protein 1207-1; 
Method: conceptual 
translation supplied by
authors 


244 



41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


915 


AE000984 


Archaeoglobus fulgidus


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


17X 


26 


918


M23159 


Cricetus
cricetus 


DHFR-coamplified protein


163 


30 


919 


L12018


Caenorhabditis elegans


putative 


1232 


41 


920 


AF102177 


Homo sapiens 


tumor antigen SLP-8p


^7=-R 

1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein


866 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


A AO 


36 


924 


U97001 


Caenorhabditis elegans


similar to 

Schizosaccharomyces pombe 




51 


925 


X71978 


Mus musculus 


Fi£ 


1503 


95 


926 


K9228 8 


Drosophila 
melanogaster 


beta-spectrin


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703__l . 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase


912 


100 


931 


U28991 


Caenorhabditis elegans


coded for by C. elegans cDNA 


660 


55 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES " 
is elegans 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


932 
933 


AL080065 
G01384 


Homo sapiens 
Homo sapiens 


hypothetical protein 

Human secreted protein, SEQ 

ID NO: 5965. 


210 
767 


25 
98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100 


935 

936 
937 


AL035681 

AB026808 
AB015345 


Homo sapiens 

Mus musculus
Homo sapiens 


cLr/56G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 
synaptotagmin XI 
HRIHFB2216 


1142 
2142 


80 

95 


938 
939 


X65724 
W89024 


Homo sapiens 

Homo sapiens


ORF2

Polypeptide fragment encoded 
by gene 156. 


2601 

498 

1487 


99 

100 

100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128.


117 


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 


942 
943 


AC024200 
AF129756 


Caenorhabditis elegans

Homo sapiens 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 
G5c 


350 




944 
945 


K23765 


Rattus 
norvegicus 


alpha- tropomyosin 


273 
133 


100 
96 


946
947
948


AC009917 

AF223468 
AF055473 


Arabidopsis 
thaliana
Homo sapiens 
Homo sapiens 


Contains similarity to 

AD021 protein 
GAGE-8


583 

551 
273 


47 

44 
51 


949 
950 


X75756

AF143956 

Y36729 


Homo sapiens 
Mus musculus
Homo 
sapiens 


protein kinase C mu 
coronin-2

Human PG1 protein sequence. 


2019 
2300
1861 


68 
93 
99 


951 
952 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2.


232 


67 




AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203


46 


953 
954 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-
1999 12-AUG-1998 Human NCE-2 
protein. 


365 


100 


955 


AF145615
U09410 


melanogaster 
Homo sapiens 


BcDNA.GH03377

zinc finger protein ZNF131 


823
2483 


46 
99 


956 
957 

958 


U09410 
AF195623 

X94917


Homo sapiens 
Homo sapiens 


zinc finger protein ZNF131
cholinephosphotransferase 1
alpha 


1853 
2126 


99 
99 


959 




Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb 


155 


32 


960 


U54807 
AF058807 


Rattus 
norvegicus 
Bos taurus 


GTP-binding protein 


1167 


97 


961 
962 


G03244 
AF078850


Homo sapiens 
Homo sapiens 


GTP-binding protein rah
Human secreted protein, SEQ 
ID NO: 7325. 


606 

471


97 

100


963 
964 


AP001754 


Homo sapiens 


steroid dehydrogenase homolog 
transient receptor potential- 
related channel 7, a novel 
putative Ca2+ channel protein 


583
317 


40 
30 


965 


AL035419 


Homo sapiens 


OaJ1100H13.Z (putative novel 

protein) 


1129


100 


966 


X61381 


Rattus
rattus 


interferon-induced protein


202 


46 


967 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme


3278 


100 




AL031432 


Homo 

sapiens


dJ465N24.2.1 (PUTATIVE novel
protein) (isoform 1) 


893 


100 






TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


% 

IDENTITY 


968 


U79275 


Homo sapiens 


unknown 


611 


100 


969 


AJ011306 


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform)


2752 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


1186 


100 


971 


U53336 


Caenorhabditis elegans


weak similarity over a short 
region to myosin heavy chain 


536 


23 


972 


AC018749 


Leishmania 
major 


L8840.12 


589 


53 


973 


AF188504 


Mus musculus


LNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens 
1 


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


976 


AF161530 


Homo sapiens 


HSPC182 


1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor XIMOl 


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrine 


2029 


100 


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982 


AJ243191


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus 
sp. AD45


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 


AB030835


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 


38 


988 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048 


99 


989 


AL022238


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990 


AF161426


Homo sapiens 


HSPC308 


448 


92 


991 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


992 


AF161426 


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859 


Schizosaccha 

romyces 

pombe 


tRNA-splicing endonuclease
subunit 


172 


42 


994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253


Homo sapiens 


R26445 1 


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A 


974 


100 


997 


AJ248285 


Pyrococcus
abyssi 


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641 


Drosophila 
melanogaster


BG:DS00941.3 gene product 


218 


58 


999 


W69343 


Homo 
sapiens 


Secreted protein of clone 
CR930 1. 


1340 


99 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100 


1005 


S45367 


Canis
familiaris


centractin


1949 


100 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 


1006 


S45367 


Canis

familiaris 


centractin 


1315 


98 


1007 


AB022158 


Mus 

musculus 


chaperonin containing TCP-1 
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38.


1282


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabditis elegans


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013


G02841


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772 


97


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 374.


2323 


n n ft 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin


1710


97 


1019 


AT183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF16479S 


Homo sapiens 


sex-regulated protein janus-a 


674 


100


1021 


AF190625 


Coturnix
coturnix 


qdgl - 1 " 


638 


96 


1022 


AL133363 


Arabidopsis
thaliana 


putative protein 


155 


1 "7 


1023 


AB034912 


Homo sapiens


WD-repeat like sequence


2483


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


U80736 


Homo sapiens 


CAGF9 


1657 


T ft ft 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccha 

romyces 

pombe 


DNA2-NAM7 helicase family 
protein 


635 


31 


1034 
1035 


Y41519 
AJ276004 


Homo sapiens 
Mus musculus 


Fragment of human secreted 
protein encoded by gene 75. 
Paxneb protein 


1321 


99 


1036 


AF025459 


Caenorhabditis elegans


H14A12.3 gene product 


1709 
190 


77 
30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo
sapiens 


Human membrane protein
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabditis elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


80 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY 


1040 


AF290204 


Homo sapiens 


, _ 

blood group carrier molecule 

DOK1 


IOJ / 


99 


1041 


Y96730


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 


AF140683 


Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BcDNA.GH04929


"3 Ail 


37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase 


1317 


100 


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens 


dJ1184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP) ) 


981 


92 


1049 


AF163825 


Homo sapiens 


pre-B lymphocyte protein 3


634 


100 


1050 


AP201949 


Homo sapiens 


60S ribosomal protein L30
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-l 


236 


85 


1052 


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98 


1054 


AL162756


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 


1055 


AF181856 


Rattus 
norvegicus


tRNA selenocysteine
associated protein 


1S25 


99 


1056 


U89649 


Chlamydomonas
reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like
protein pemphaxin 


1710 


99 


1059 


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060


AF224263 


Heterodontus
francisci


HoxD8 


742 


~5ri — 

83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062


AL079345 


Streptomyces
coelicolor
A3(2)


hypothetical protein 


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10).


2547 


100 


1064 


AF263614


Homo sapiens 


acetyl-CoA synthetase


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PRO221.


1363 


100


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292)


662 


no 

y o 


1067 


Y18930 


Sulfolobus 
solfataricus 


hypothetical protein 


162 


O Q 


1068 


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived
polypeptide.


887


100


1069 


Y07964 


Homo sapiens 


Human secreted protein
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding
protein 


1995 


86 


1071 


AF245505


Homo sapiens 


adlican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens 


Amino acid sequence of 


1271 


91 
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TABLE 2 



SEQ
ID 
NO: 


ACCESSION
NUMBER


SPECIES


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


1076 


AF161457 


Homo sapiens 


protein PRO328.
HSPC339 






1077 
1078 


Y79509 
AF223466 


Homo sapiens 
Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5. 
htois protein 


571 
21S1 


100
98


1079 
1080 


AL13296S 
AB024937 


Arabidopsis
thaliana

Homo sapiens 


putative WD-40 repeat-protein 
LUNX 


831 
286 


66 
29 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like 
protein 


1284 
579 


100 
100 


1082


AF016416


Caenorhabditis elegans


F29A7.4 gene product 


141 


31 


1083 
1084 


L13291 
AB041541 


Homo sapiens 
Mus musculus 


ADP-ribosylarginine hydrolase
unnamed protein product 


802 


45 


1085 
1086 


G01922 
AB030814 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003.

H-REV107 protein homolog 


151
202


44 

97 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer- 
protein 


833 
1142 


100 
100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a
human RNA- associated 
protein. 


2783 


100 


1089 

1090 
1091 


Y94867 

AK023982
AB041586 


Homo 
sapiens 
Homo sapiens 
Mus musculus 


Human protein clone HP10563. 
unnamed protein product 


613 
130 


100 
49 


1092 
1093 


Y71277 
U34973 


Homo sapiens 
Mus musculus 


unnamed protein product 
Human Zlipo3 protein, 
protein tyrosine phosphatase - 
like 


1103 

606 

1131 


81 

100 

95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


522 




1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


1029 


99 


1096 
1097 


Y87276 
AF161455 


Homo sapiens 
Homo sapiens


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 
HSPC337 


863 


98


1098 
1099 


U80029 
AJ005866 


Caenorhabditis
elegans
Homo sapiens 


similar to thioredoxin 
Sqv-7-like protein


742 
242 

1321 


98 
39 

99 


1100 
1101 
1102 


AJ005865 
AJ005866 
>wuui bos 


Homo sapiens 
Homo sapiens 
Homo sapiens 


Sqv-7-like protein 
Sqv-7-like protein 
Sqv-7-like protein


1118 

891 

1016 


99 
99 
99 


1103 
1104 


AL110244 
AF242194 


Homo sapiens 

Drosophila 

melanogaster 


hypothetical protein 
brakeless-B 


299 
147 


31 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 
1107 


U28016 
AJ278150 


Mus musculus 
Homo sapiens 


parathion hydrolase 
(phosphodiesterase) -related 
protein 


1624 


87 


1108 


G03733 


Homo sapiens 


putative lipid kinase 
Human secreted protein, SEQ 
ID NO: 7814. 


2207 
495 


99 
98 


1109 


-ftr £■ X / Z a 1 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 

1111


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 




Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 
1114 


AF182076 
G04039 


Homo 
sapiens 
Homo sapiens 


glioma tumor suppressor 
candidate region protein 2 
Human secreted protein, SEQ 


2418 
475 


100 
96



1115 



1116 



1117 



AF229439 Mus musculus



L40357 



Homo sapiens



L40357



Homo sapiens 






ID NO: 8120. 






zinc finger protein 289 



1697 



thyroid receptor interactor 



509 



thyroid receptor interactor 



04 






100 



85 
100 



1118 



A12155 



Homo sapiens 



Human X5L cDNA. 

isomerase like protein 



673 



1119 



AL161542 



Arabidopsis 
thaliana



53 



1120 



1121 



1122 



1123 



1124 



AL023754 



Homo sapiens 



Y57901 



Homo sapiens 



dJ272L16.1 (Rat

Ca2+/Calmodulin dependent

Protein Kinase LIKE protein)

Human transmembrane protein 

HTMPN-25.



2341 



321 



Z14122 



Xenopus 
laevis 



XLCL2 



455 



AF225418 



Homo sapiens 



Y06518 



Homo sapiens 



lipase

Zen GTPase interacting 
protein ZIP. 



1531 



3227 



36 



77 



97 



100 



1125 



1126 



1127 
1128 



AL035690



AJ000217 



Homo sapiens 
Homo sapiens 



dJ202I21.1 (novel protein)



952 



AB030505



Mus musculus 



CLIC2 
UBE 



lc2 



1069 



Y73375 



Homo sapiens 



HTRM clone 1427838 protein 
sequence.



Cyclophilin-type peptidyl
prolyl cis/trans isomerase



100 



99 



1129 



1130 
1131 



1132 



1133 



Y78941 



Homo sapiens 



877 



amino acid sequence 



AL023553 



Homo sapiens 



Y91945 



Homo sapiens 



Z68197 



Schizosaccharomyces
pombe



Z68197 



Schizosaccharomyces
pombe



dJ347H13.4 (novel protein)



Human chaperone protein 6 
(HCHP-6).



557 



1408 



putative nuclear pore protein 



596 



putative nuclear pore protein 



389 



100 



100 



100 



39 



35 



1134



AF180681



Homo sapiens



guanine nucleotide exchange 
factor 



3597 



100 



1135 



1136
1137



1138 



1139 



1140 



AF079765 



Mus musculus 



enhancer of polycomb



264 



M62419 



Mus musculus 



clathrin-associated protein 



2189 



AJ006219 



Drosophila 
melanogaster 



clathrin-associated protein 



1254 



Y76218 



Homo sapiens 



Human secreted protein 
encoded by gene 95. 



440 



W88104 



Homo 
sapiens 



A Rab protein designated
HRABS-2.



1065 



Y13401 



Homo sapiens 



Amino acid sequence of 
protein PRO339.



3979 



41 



99 



78 
98



98 



1141 W85026 



1142 



Chimeric - 
Homo sapiens 



Green fluorescent protein- 
Zap70 fusion product.



Y13402 



Homo sapiens 



Amino acid sequence of 
protein PRO310.



1694 



100 



99 



1143 



1144 



1145 



G03875



Homo sapiens 



Human secreted protein, SEQ
ID NO: 7956.



660 



Y12917 



Y12917 



Homo sapiens
Homo sapiens



Amino acid sequence of a 
human secreted peptide. 



750 



Amino acid sequence of a 
human secreted peptide. 



1096 



99 



98 



100 



1146 AL022157 Homo sapiens 



SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))



1233 



100 



1147 



AL022157 



Homo sapiens 



SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))



1233 



100 



1148 



1149 



G02548 



Homo sapiens 



Human secreted protein, SEQ
ID NO: 6629.



Y73338 



Homo sapiens 



HTRM clone 2019742 protein 
sequence.



370

1492 

228 



93 



100 

~S5~~ 



1150 



W74841 



Homo sapiens 



Human secreted protein 
encoded by gene 113 clone














1151 


AF044201 


Rattus
norvegicus


neural membrane protein 35; 


1570 


92 


1152 


AF156774


Homo
sapiens


lysophosphatidic acid
acyltransferase-gamma 1


1855 


99 


1153 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens


w J irv 1 1UWI 1 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 


oUi U JO 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117.


607 


99 


1157 


AF112444


Lupinus 
luteus 


L-asparaginase


287 


43 


1158


HPT ci QAP 


Homo sapiens 


CGI-90 protein


232 


32 






Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


\9G 


33 


1161


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163


At X ± .3 3 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75 


1167 


rtV XD # fjJ 



Homo sapiens 


syntaphilin


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase


951 


55 


1169 


ArUb4b \J*± 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188 


Saccharomyces
cerevisiae


putative 


180 


22 


1172 


AF113751


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417


Homo sapiens 


GSb protein 


794


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein)


1285 


100 




U41278


Caenorhabditis
elegans


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens


T-cell receptor V-alpha-J-
alpha region


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496


55 


1181 


Y11710


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo
sapiens]
>R94974
R94974 09-
MAY-1996 27-
OCT-1994
Human TCL-1
polypeptide.


T cell leukemia/lymphoma 1


617 


100






[Homo 
sapiens 








1183


U42841


Caenorhabditis
elegans


short region of weak
similarity to collagen 


161 


33




1185


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03 . 


636 


100 


1188 


AF217544


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


182 


33 

100


1190 


X89602 


Homo sapiens 


rTSbeta 


197 




1191 


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1 


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone
vcl6 1 derived protein. 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice
variant RB3


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a


2981 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197


AF157318 


Homo sapiens 


AD-017 protein


912 


47 


1198 


AF125443 


Caenorhabditis
elegans


contains similarity to S. 
pombe phosphatidyl synthase 
(GB:Z28295)


460 


39 


1199 


AF201934


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202


Z85986


Homo sapiens 


dJ108K11.3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


2235 


76 


1205 


AB002327


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone 

biosynthesis 

methyltransferase-like


762 


56 


1207 


AL136307


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth)


742 


100 


1208 


AF207989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 



Z97630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


181 


44 


1210 


U21549 


Mus musculus


Ac39/physophilin


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12.


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein


945 


66 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849


Mus musculus 


meiosis-specific nuclear
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216


Z72510 


Caenorhabdit 


similarity to yeast UTR3 


634 


49






is elegans 


protein (Swiss Prot accession 

vk.6 77hll S rnmf «; frnm hViq 

gene 






1217 


249703 


Saccharomyces
cerevisiae


unknown 




22 


1218 


AC013430


Arabidopsis 
thaliana 


F3F9.18


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


XUZo 


71 


1220 


Z70750 


Caenorhabditis
elegans


similar to vanadate 
resistance protein
transmembranous comes from
this gene 


965 


SB 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


O jj 


61 


1222 


AP155100 


Homo sapiens 


zinc finger protein KY-REN-21 
antigen 


2261 


100 


1223 


J05071 


Bos taurus


GTP-binding regulatory
protein gamma-6 subunit


O 3D 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence.


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226


X64002 


Homo sapiens 


RAP74 


2661 


99


1227


X04085 


Homo sapiens 


catalase 


2846 


100 


1228 


AOT005620 


Mus musculus 


skeletal muscle-specific gene


1416 


90 


1229 


AF045564 


Rattus
norvegicus


development-related protein 


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231 


L08239 


Homo sapiens 


located at OATL1


2274 


100 


1232 


-X U U -> 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


x ^ j. o y ^ 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


" O \J 


v— denor naoci jl u 


contains similarity to 


744 


31 


1235 


AC006634 


Caenorhabdit 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
VTiP/i i tax*. • Tnm o\ 


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated-
tyrosine-phosphorylated
protein


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ

ID NO: 4510. 


324 


100


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483


Gallus 
gallus 


Yes-associated protein
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246


AJ27^003 


Homo sapiens 




1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248


AC004874 


Homo sapiens 


similar to N-

acetylgalactosaminyltransfera
se; similar to Q07537
(PID:g1171989)


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-
19


124 


46


1252 


AF146738 


Rattus 
norvegicus 


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806.


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating 
enzyme polypeptide. 


1045


Q Q 


1255 


AC006538 


Homo sapiens 


BC41195_1 


831 


T Q 
/ O 


1256 


AB004316 


Bos taurus


mitochondrial methionyl-tRNA
transformylase


1556 


88 


1257 


Z35094 


Homo sapiens 


SURF-2


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214.


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming 
protein; similar to P14373 
(PID:g132517)


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyl transpeptidase
(AA 1-568) 


697 


32 


1263


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide-
associated protein-1 (CNAP-
1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence.


1622 


100


1267 


AF061346 


Mus musculus


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabdit 
is elegans 


C13F10.4 gene product 


154 


23 


1269 


AF233582


Mus musculus


GTPase Rab37


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 


55 


1272 


AF201933 


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710


Arabidopsis
thaliana


putative protein 


348 


49 


1275 


AC004449 


Homo sapiens 


R33683 3 


556 


100 


1276 


Y86295


Homo sapiens 


Human secreted protein 
HL2AGB7, SEQ ID NO: 210.


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


478 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis
thaliana 


Similar to AIG1 protein 


406 


35 


1283 


AK024432 


Homo sapiens


FLJ00022 protein 


403 


35 


1284 


W96153 


Homo sapiens


Human FADD-interacting
protein \ c jut / . 


1825 


81 


1285 


AJ001019


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AF178632 


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
138027 (PID:g2135214)


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214)


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54


1291 


Z73424 


Caenorhabditis
elegans


C44B9.1 


235 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551, 


1222 


100 


1293 


AF190425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140 


489 


29 


1294 
1295 


G03856 
AF133670 


Homo sapiens 
Mus musculus 


Human secreted protein, SEQ 
ID NO: 7937.

ARL-6 interacting protein- 2 


538 
367 


99 
51 


1296 
1297 


ACT249735 
X57560 


Homo sapiens 

Escherichia 

coli 


claudin-6 
pspE protein 


1142 
535 


100 
100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains
protein 1 


1997 


100 


1299 
1300 


U41023 
AB024523 


Caenorhabditis
elegans

Homo sapiens 


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
yki09h8.5 

basic kruppel like factor 


324 


29 


1301 
1302 


X55989 
AF007151 


Homo sapiens 
Homo sapiens 


eosinophil cationic-related 

protein 

unknown 


1206 
737 


100 
99 


1303 


X52904 


Escherichia 
coli 


open reading frame (AA 1-65) 


1481 
359


100 
100 


1304 
1305 


U19577 
AF266508 


Escherichia 
coli 

Mus musculus 


galactonate dehydratase 
NELF protein 


242 


93 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


932 


97 
100 


1307 


U88750

Tl 1? n A A n^j* 


Caenorhabditis
elegans


similar to the mitochondrial 
carrier family 


365 


54 


1308 
1309 


/ir U 44 A/4 
AL078593


Homo sapiens 
Homo sapiens 


breakpoint cluster region 
protein 2 

dJ210B1.1 (KIAA0680)


2681 
267 


99 
34 


1310 
1311 


X82693 
Z82263 


Homo sapiens 
Caenorhabditis
elegans


E48 antigen

C47A4.1


620 
283 


96 
35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading
frame 5 


1493


100 


1313 
1314 


Y41763 
AF196972 


Homo 
sapiens 
Homo sapiens 


Human PRO938 protein
sequence.
JM24 protein 


1636 


100


1315 




Homo sapiens 


insulin receptor substrate 
like protein 


2239 
228 


100 
97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF153127 


Gallus 

gallus


SAPK interacting protein 


2442 


89 


1318 
1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1320 


AF153127 
X56932 


Gallus 
gallus 

Homo sapiens 


SAPK interacting protein 


1651 


86 


1321 
1322 


AF174605 


Homo 
sapiens]
>Y83086
Y83086 09-
MAR-2000 28-
AUG-1998 F-
box protein
FBP-18.
[Homo
sapiens


F-box protein Fbx25


1044
467 


100 
70 




M61732 


Trypanosoma 
cruzi 


neuraminidase 


214


24 


1323 


Y17013 


porcine 
endogenous 


pol 


304 


64


1324


AL138655


retrovirus 

Arabidopsis 

thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


AL133215 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 
1329 


Y73346 
L10910 


Homo sapiens 
Homo sapiens 


HTRM clone 619699 protein 
sequence.
splicing factor 


785 

912 


96 
82 


1330 
1331 


AF146568 
W87772 


Homo sapiens 
Homo sapiens 


MIL1 protein 

Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


1936 
232 


100 
39 


1332 



Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence.

zinc- finger protein ZBRK1 


1860 
411 


100
91 


1333
1334 


Z82271 


Homo sapiens
Caenorhabditis
elegans


Similarity to Mouse kinesin-
like protein KIF4 comes from 
this gene 


578 


44 


1335


r\tl> UUuOlU 


Methanobacterium
thermoautotrophicum


conserved protein 


290 


43 


1336


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabditis
elegans


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204


89 


1341 


AC011914 


Arabidopsis 
thaliana


putative mutT protein; 6 839B- 

67B81 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343




Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins;
similar to BAA77027
(PID:g4650844)


894


35 


1345 


AF257466


Homo sapiens 


N-acetylneuraminic acid
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like
protein 


1664 


58 


1348 


AF161548


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96 . 


1117


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6225.


418 


100 


1352 
1353 


D90869 
A12029 


Escherichia 
coli 

Homo sapiens 


similar to 
MRP-14


2047 
613 


100 
100 


1354 
1355 

1356 
1359 
1360 
1361 


AC005328 
AC024876 

AF077226 
AF217188 
AC074331 
AL163279 


Homo sapiens 
Caenorhabditis
elegans
Homo sapiens 
Mus musculus 
Homo sapiens 
Homo sapiens 


R26660_1, partial CDS

contains similarity to 

SW:RPB1_CRIGR

copine III 

YIP1B 

ZNF234 

homolog to cAMP response 


870 
829 

1876 
801 
3869 
5035 


74 
61 

64 
63 
100 
99








element binding and beta 
transducin family proteins 






1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475


Homo sapiens 


glucokinase regulator 


2682 


97 


1364 




Homo sapiens 


megakaryocyte-enhanced gene
transcript 1 protein; MEGT1
protein 


2055 


99


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915


581 


100 


1367


AL117352




Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 
T19B10.6 (Tr:Q22557)) 


2581 


99 


1368


Y34124


Homo 
sapiens 


Human potassium channel 
K+Hnov15.


1342 


100 




AJ245621


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AJ? 0 00 2 2 0 


Bacillus 
subtilis 


YtaG 


429 


45 


1371


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -
25 to 1018) (3416 is 2nd base
in codon)


5908 


99 


1372


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567


69 


1375


U53445


Homo sapiens 


DOC1


1645 


46 


1376


AL117337 


Homo 
sapiens 


bA393J16.1 (zinc finger
protein 33a (KOX 31)) 


250 


60 


1377


AC005328 


Homo sapiens 


R26660_1, partial CDS


1126 


100 


1378


U35113


Homo sapiens 


metastasis -associated gene 


1823 


69 




1*153 13 


Caenorhabditis
elegans


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360


Homo sapiens 


ANKHZN 


959 


97 


1383


AF237676 


Mus musculus 


G beta-like protein GBL


1721 


96 


1384


AF237676 


Mus musculus 


G beta-like protein GBL


1043 


70 


1385


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1.


715 


100 


1386




Homo sapiens 


nine in 7 


10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388




Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996
27-APR-1995 TRP-1 protein.


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


AC03 51S0 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 




Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 

INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


A*r u / o £. *± zf 


Homo sapiens 


zinc finger protein SBBIZ1


3208 


99 


1395


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305.


299 


75 


1396 


AC004809 


Arabidopsis
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein
like protein doppel) 


962 


100 


1400


Y48611


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyces
cerevisiae


putative HMG box 


164 


27


1403 


Y79222 


Homo 
sapiens 


Human transferase TRNSFS-14. 


2842 


100 


1404


X81058 


Mus musculus 


tex261 


1010 


99 


1405 


AB012084 


Mus musculus 


ITM 


194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


Rattus 
rattus 


PTB-like protein 


2684 


99 


1408 


X75760 


Drosophila 
melanogaster 


LRR47 


364 


29 


1409 


U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


AC005578 


Homo sapiens 


F20B87_1, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein. 


360 


100 


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179)


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2))


5696 


100


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qsl4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y48517


Homo sapiens 


Human breast tumour-
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886


Bos taurus


differentiation enhancing 
factor 1 


4693 


95 


1428


U41387 


Homo sapiens 


Gu protein 


1372


63 


1429 


AF161534 


Homo sapiens 


HSPC049 


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein
PRO1106.


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


192 


34 


1434 


R99900 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1.


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100


1442 


AB039669 


Homo sapiens




1944 


100 


1443 


AF237711 


Drosophila 
melanogaster


Diablo 


191 


27 


1444 


AJ011096 


Homo sapiens 




439


39 


1445 


X73874 


Homo sapiens 


±jiiKJz>yii\_>i. y j. clot; Kinase 


6233 


98 


1446 


AF214114 


Homo sapiens 


*** i_ — LlUIIIa'-aSSOClal.cu 

antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 


AF003136 


Caenorhabditis
elegans


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


i — kcw — 3 li aiiLi ge n 


1184 


89 


1450 


Y95004 


Homo sapiens


Human secreted protein 
vcz>* i , shy XL/ NO : 4 8 . 


S85 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


688 


57 


1452 




Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


Z38011 


Mus musculus


UWK- NSf 


882 


56 


1454 


\J z> o o 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564M11.3 (similar to
sialyltransferase)


1356 


100 


1456


riA a a an 


Mus musculus 


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


AF242552 


Gallus 
gallus


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465


U32743 


Haemophilus 
influenzae
Rd 


fucose operon protein (fucU)


315 


50 


1466


Y09022 


Homo sapiens 


Not56-like protein 


2342


100 


1467 




Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene


1072 


99 


1468 


AF071544 


Spinacia 
oleracea


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


26 


1469 


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54.


1053 


100 


1470 


AF032666


Rattus
norvegicus 


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens


Human membrane channel 
protein-17 (MECHP-17).


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 




genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTS1 


1101


50 


1475 


Y86241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 155. 


1879


?o 


1476 


AJ010317 


Fugu 

rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabdi t 
is elegans 


coded for by C. elegans cDNA
yk99b4.3; similar to human
transforming protein
(PIR:S22157)


846 


44 


1478 


X62447 


Homo sapiens 


PR264


543 


61 


1479 


X82209


Homo sapiens 


mnT 


7116 


100 


1480 


U10536


Pan paniscus 


MHC class I A


675 


04


1481


AT f 1 »7 Q c Q Q 


Homo sapiens 


dJ951C6.1 (novel protein
similar to C. elegans
F55A12.9 (Tr:P91086))


1274 


65 


1482


Z98977


Schizosaccharomyces
pombe


putative vacuolar protein 


256 


29 


1483 


JllJ \J \j j o o ^ 


Mus musculus


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485


l v ij£ / O f O 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 


575 


99 


1487 


X84156 


Saccharomyces
cerevisiae


ATH1 


341 


29 


1488 


AF038963 


Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabdi t 
is elegans 


coded for by C. elegans cDNA 
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000989 


Archaeoglobus
fulgidus


enoyl-CoA hydratase (fad-4)


533 


46 


1491


M80633 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence.


3513


99 


1493 


Y17220 


Homo sapiens 


Human secreted protein (clone 
fj283-11).


452 


37 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


1371 


100 


1496


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 


AF037447


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma
acidophilum 


putative target YPL207w of
the HAP2 transcriptional
complex related protein 


269 


35 


1499


AB039947


Homo sapiens 


X11L-binding protein 51


227 


36 


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


AL050333


Homo 
sapiens 


dJ93K22.1 (novel protein
(contains DKFZP564B116))


2439 


100 


1502 


AF179895


Homo sapiens 


TALE homeobox protein Meis2b


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 


Y53005 


Homo sapiens 


Human secreted protein clone 
pn749_8 protein sequence SEQ 
ID NO: 16. 


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2


3580 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


ALiU J454 8 


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 


U64601 


Caenorhabditis
elegans


Gene probably begins in the 
next cosmid 


415 


58 


1511 


AL356192 


Neurospora 
crassa 


related to MDM1 protein 


196 


29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6-
sulfate sulfatase (GALNS)


1829 


100 


1513 


AF168717 


Homo sapiens 




694 


99 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AF003140


Caenorhabditis
elegans


C44E4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate
aminotransferase 


2238 


82 


1519 


AL121764 


Schizosaccharomyces
pombe


yeast atp12 protein precursor
homolog


270


30






1520 
1521 


f\E Z> _> z} JL\J 

D31764 


Homo 
sapiens 
Homo sapiens 


vascular endothelial 
junction-associated molecule 
KIAA0064 


547 


100 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein
PRO190. 


170 
985 


27 
100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


37 


1525 
1526 


AF109377 
AL031427 


Mus musculus 
Homo sapiens 


ldlBp 

dJ167A19.4 (novel protein) 


1277 


83 


1527 
1528 


Y08135 
AK024423 


Mus musculus 
Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase 
FLJ00012 protein


1432 
1496 


99 
79 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase 


611 
679 


100 
100 




AF205598


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


493


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein
RanBP6 


Z> IsJ I 


99 


1534 


AC007190


Arabidopsis 
thaliana 


F23N19.9 


374 


37 


1535 
1536 


AB027564 
Y36178 


Homo sapiens 
Homo sapiens 


DINB1 


4482 


100 


1537 


Y50507


Homo sapiens 


Human fetal brain cDNA clone 
vb3_1 derived protein.


377
3593


87
99 


1538
1539 


AF017368 
AF266756 


Mus musculus 
Homo sapiens 


faciogenital dysplasia 
protein 2 

sphingosine kinase


177 
2011 


47 
99 


1540 
1541 


Z48604 
AF000195 


Homo sapiens 
Caenorhabditis
elegans


OA1 

Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


2238 
379 


100 
42 


1542 

1543 
1544 


Y71159 

X76092 
AB015330 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 

DNA binding protein RFX3 
HRIHFB2007


9415 
3327


99 
100 


1545 
1546 

1547 


AF198487 
AF016417 

X55885 


Homo sapiens 
Caenorhabditis
elegans
Homo sapiens 


transcription factor LBP-lb 
Similar to BZIP transcription 
factor 

KDEL receptor 


631 
2822
518 


50 

100 

42 


1548
1549


AL021707 


Carassius 
auratus 
Homo sapiens 


ubiquitin-activating enzyme 
El 

dJ508I15.4 (KIAA0668)


1106 
836 


100 
42 


1550 


AJ223978


Bacillus 
subtilis 


YvqK protein 


3688 

292 


100 
42 


1551 


AF145615 


Drosophila 
melanogaster 


BcDNA.GH03377


822


44 


1552 
1553 


AL157734 
AF079527


Schizosaccharomyces
pombe

Mus musculus 


putative mannosyl transferase 
involved in N-glycosylation 

IER5


435 

691


37 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


63 
88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 
1557 


AF116553 
Y71056 


Drosophila 
melanogaster 
Homo sapiens 


antennal-specific short-chain
dehydrogenase/reductase
Human membrane transport
protein, MTRP-1.


277
1975


32
99






1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894


97 


1560 


AF092050 


Mus musculus 


beta-1,3-N-
acetylglucosaminyltransferase


262 


44 


1561 


AL109827 


Homo sapiens 


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216 1 1 


919 


82 


1566 


AF000195 


Caenorhabditis
elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568


D49473


Mus musculus 


truncated form of Sox17


1047 


78 


1569


AK025270


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


99 


1571 


ABM 4 57 13 


Homo sapiens 


SCHIP-1 


2388 


100 


1572 


AE003831 


Drosophila 
melanogaster


CG18445 gene product 


180 


31 


1573


AF074603 


Streptomyces
griseus
subsp.
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabditis
elegans


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster


Diablo 


421 


54 


1578 


G00975 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744 


Cryptosporidium
parvum


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA 
Em:AK000219))


663 


100 


1581 


AF041853


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


*i "a 
4<) 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198 


100 


1583 


AH001803 


Thermotoga
maritima 


glycerate kinase, putative 


349 




1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein


3973


i no 

JLUXJ 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1 


3494 


99 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587


X79440






3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4. 


181 


47 


1591 


Z25535


Homo sapiens 


nuclear pore complex protein
hnup153


7567


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces
pombe


hypothetical protein


235


54








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ 
ID NO: 18. 


2236 


98 


1597 


AF174605


Homo sapiens 


F-box protein Fbx25


1408 


99 


1598 


AB032254


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf50


2305 


100 


1601 


Y00876 


Homo 
sapiens 


Human LAPH-1 protein 
sequence.


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3 


2821 


99 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605


AF185576 


Mus musculus


POZ/zinc finger transcription 
factor ODA-8 


3435




1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800 


~QQ 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868 


100


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


3765


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 




100 


1612 


AF220560 


Homo sapiens 


B/K protein 


4*± a o 


99 


1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501


Homo sapiens 


NADH-cytochrome-b5 reductase


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


-lien 


97 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1)


p on 


62 


1617 


X58079


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y66678


Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98


1622 


X64177


Homo sapiens 


metallothionein 


380 


100


1623 


AE001045 


Archaeoglobus
fulgidus


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013


Schizosaccharomyces
pombe


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PRO1198.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y35954


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203. 


756 


100 


1628


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus 


unknown 


286 


68 


1630 


AF017096 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704


100


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


o¥f 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z26200 come
from this gene. 


143 


38


1635 


AF026246 


Homo sapiens 


HERV-E integrase 


411 


90 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_1 derived protein.


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ
ID NO: 90. 


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen


766 


99 


1641 


AF233288 


Drosophila 
melanogaster 


WDS 


358 


26 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2) . 


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42. 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648 


Y57342


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW) . 


4137 


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens 


Eps15R


4464 


99 


1652 


AL161576


Arabidopsis 
thaliana


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


AB017910


Dictyostelium
discoideum


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccharomyces
pombe


actin-like protein; (2 actin
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664 


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyces
cerevisiae


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191 1 


1581 


47 


1668 


S67513 


Borna
disease
virus BDV,
WT-1, Halle
B1/91, horse
brain, field
isolate,
Peptide, 370
aa


p40


397


43


1669 


Z99753


Schizosaccharomyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211.


427 


97 


1671 
1672 


M96625 
AF174482


Gallus 
gallus 

Homo sapiens 


cardiac muscle tensin
polycomb 3 


1185 


54 


1673 
1674 


Y51846
AF255334 


Homo sapiens 
Homo sapiens 


Human 18.1 homolog protein 

fragment.

EXP35 


2005 
233 


99 

«6 27 


1675 


Y94367


Homo 
sapiens 


Human protein clone HP10563. 


152 
109 


29 
30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2.


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 




1679 
1680 


nf loJibJl 
AK024453 


Homo sapiens 
Homo sapiens 


dentin sialophosphoprotein 

precursor 

FLJ00045 protein


170 


17 


1681 


AF019236 


Dictyostelium
discoideum


TipD 


1349 
613 


100 
34 


1682


AJ243459 


Leishmania 
major 


proteophosphoglycan 


153 


26 


1683
1684


Z69369 
X94910


Schizosaccharomyces
pombe

Homo sapiens 


putative GTP-binding protein
ERp28 




46 


1685 

1686 
1687


AF286475 
AF191298 


Takifugu 
rubripes 
Homo sapiens 
Homo sapiens 


retinitis pigmentosa GTPase 
regulator-like protein
vacuolar sorting protein 35
transcription factor 


196 
4087 


100 
19 

100 


1688 
1689 


AJ275986 
X07311 


Homo sapiens 

Drosophila 

melanogaster 


transcription factor 
heat shock protein 


2958 
1886 
138 


100 

88 

43 


1690 
1691 


AF240453
AJ272078


Rattus 
norvegicus 
Homo sapiens 


LIS1-interacting protein
NUDE1 

APOBEC-1 stimulating protein 


1383 
1256 


83 
68 


1692 
1693 

1694 


AJ272079 
AF177942 

AF263539 


Homo sapiens 

Xenopus 

laevis 

Homo sapiens 


APOBEC-1 stimulating protein
katanin p60 


1336 
1664 


60 
66 


1695 
1696 


AF222689
AK000193 


Homo 
sapiens 
Homo sapiens 


arginine N-methyltransferase

protein arginine N-

methyltransferase 1-variant 2


1774 
1182 


100 
81 


1697 


AB041035 


Homo sapiens 


unnamed protein product 
kidney superoxide-producing
NADPH oxidase 


1060 
3122 


100 
100 


1698 
1699 


AB041035 
AF025772 


Homo sapiens 
Homo sapiens 


kidney superoxide-producing
NADPH oxidase


2181 


100 


1700 

1701 
1702 


Y44676 

AK022407 
AB024574 


Homo sapiens 

Homo sapiens 
Homo sapiens 


C2H2 zinc finger protein 
Human ARF-Related Protein-1 
(HARP-1).

unnamed protein product 
GTP-binding like protein 2


488
938 

315 
1172 


54
97 

98 

100


1703 
1704 
1705 

1706 


AF055078 
AF198092 
AK003573


Homo sapiens 
Mus musculus
Drosophila 
melanogaster 


zinc finger protein 42 
RP42 

CG12474 gene product 


421 

1057 

161 


52 
77 
33 


1707 
1708 
1709 


AB036345 

Y55927 
U27121 
AL391710 


Drosophila 
melanogaster 
Homo sapiens 
Danio rerio 
Arabidopsis
thaliana


aquaporin

Human STLK2 protein.
G12

putative protein


164

2146

212

505


24

100

47

50








1710 


B01311 


Homo sapiens 


Human PR0241 polypeptide. 


1649 


97 


1711 


U40750 


Mus musculus 


formin binding protein 30 


4561 


85 


1712 


AJ011118 


Mus musculus 


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic
acid binding protein 


2960 


100 


1715 


U08227 


Rattus 
norvegicus 


Ras-related protein


511 


51 


1716 


AF168795 


Rattus 
norvegicus 


schlafen-4


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


100 


1719 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1069 


46 


1720 


AF071317 


Mus musculus 


COP9 complex subunit 7b


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1681 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063. 


718 


100 


1723 


AL032643 


Caenorhabditis
elegans


similar to Uncharacterized 
protein family UPF0034,


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053.


586 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 


AF255443


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 


AF103426 


Homo sapiens 


HT004 protein 


1810 


99 


1728 


D10884 


Bos taurus 


neurocalcin 


1002 


99 


1729 


Z18529


Gallus 
gallus 


tensin 


1411 


64 


1730 


Z73423 


Caenorhabditis
elegans


cDNA EST EMBL:Z14908 comes
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891 


Homo sapiens 


PRO0105 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8 


2015


100 


1734 


G04050


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein


3531 


94 


1736 


AF096709 


Drosophila 
virilis 


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99 


1738 


L15314 


Caenorhabditis
elegans


contains similarity to Pfam 
family PF01772 N=1


206 


37


1739 


X54618 


Listeria
monocytogenes


phosphatidylinositol specific
phospholipase C 


134 


27 


1740 


AL031658 


Homo sapiens 


dJ310O13.4 (novel protein 
similar to predicted C. 
elegans and C. intestinalis
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173. 


1013 


99 


1742 


AC013354


Arabidopsis 
thaliana


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08.


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSIA 


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100


1748
1749


AK024436
AE000877


Homo sapiens 
Methanobacterium
thermoautotrophicum


FLJ00026 protein
conserved protein 


1619 
231 


100 

36 


1750 
1751 


AF101361
Y15367 


Drosophila 
melanogaster
Homo sapiens 


Abnormal X segregation 

ZNF232 


193 
889 


jj 
100 


1752 
1753 

1754 


AF251038
AC003093 

X69089 


Homo sapiens 
Homo sapiens 

Homo sapiens 


GAP-like protein
OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)
165kD protein 


822 
352 

5703 


100 
57 

99 


1755 
1756 

1757 


A1j0497S5 
AL031393 


Homo sapiens 
Homo sapiens 


dJ622L5.3 (novel protein)

dJ/J3D15.1 (Zinc-finger 

protein) 


2765 


100
100 


1758 
1759 


AB040672

AL022238 
AF117653 


Homo sapiens 

Homo sapiens 
Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase

dJ1042K10.4 (novel protein)

dOUble hnmPAhOY r-k a ■» e> 


2020 
776 


99 
43 


1760
1761 


Y12065 
AL049712 


Homo sapiens 
Homo sapiens 


^ uuiiicumjjt protein 
hNop56 

dJ686C3.2 (nucleolar protein
hNop56) 


375 

2959 

2595 


54 
99 
99 


1762 
1763 


AC002394 


Homo 
sapiens 


>JC ** c= jfAvuuct wi.n similarity 
to dynein beta subunit


1542 


51 


1764 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


877 


100 




U91541


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100 


1765 
1766 


AB013365 


Bacillus 
halodurans 


YlqF


350 


34


1767 


Y38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1768 
1769 
1770 


AC009176 

AK000647 
AJ238982 
U73522 


Arabidopsis
thaliana 

Homo sapiens 
Homo sapiens 
Homo sapiens 


putative ribulose-1,5-
bisphosphate

carboxylase/oxygenase small
subunit N-methyltransferase I
unnamed protein product
VNN3 protein 
AMSH 


216 

737 

2665 

1214 


27 

99 
99 
56 


1771 
1772 
1773 
1774 

1775 
1776 


U89435 
S70011 
AL035086
Y99426 

AF110330 


Mus musculus
Rattus sp.
Homo sapiens 
Homo sapiens 

Homo sapiens 


unknown 

tricarboxylate carrier 
dJ44A20.2 (novel protein) 
Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308.
glutaminase


829 
1604 
2036 
1057 

3146 


86 
95 
100 
99 

100 


1777 
1778 


AJ269529 
Z81579 

AY007239 


Homo sapiens 
Caenorhabditis
elegans
Homo sapiens 


glycerol 3-phosphate permease
cDNA EST yic76JEl.5 comes from
this gene 


2787 
232 


100 

31 


1779 
1780 


AL109608
AF254260


Schizosaccharomyces
pombe

Homo sapiens 


monooxygenase X 
oxysterol-binding protein
family 

tuftelin 1


1875 
644 


99 
38 


1781 
1782 


L07924 


Mus musculus


guanine nucleotide 
dissociation stimulator 


1729 
247 


100 
50 


1783
1784 


AF295773 
AK024475 


Homo 
sapiens 
Homo sapiens 


ral guanine nucleotide 
dissociation stimulator 
FLJ00068 protein 


142 
4333 


49
100 


1785 
1786 


AK024475 
G03933 

S82637


Homo sapiens 
Homo sapiens 

Homo sapiens 


FLJ00068 protein

Human secreted protein, SEQ

ID NO: 8014.

Eg lambda-like gene/beta-
glucuronidase exon 11 homolog


3996
570

247


93
100

100


TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS*


2 




Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e-12 157-181


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e-13 358-381


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 9.400e-10 1129-1146
BL00028 16.07 1.257e-09 820-837


5 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e-33 413-450
BL00023 24.31 4.545e-27 353-390


6 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-33 413-450
BL00023 24.31 4.545e-27 353-390


7 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-33 413-450
BL00023 24.31 4.545e-27 353-390


8 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins.


BL00023 24.31 8.920e-33 413-450
BL00023 24.31 4.545e-27 353-390


9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.119e-09 863-917


10 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464D 17.40 6.182e-12 294-312
PR00464G 12.41 4.231e-11 377-393


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e-09 502-520


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e-10 89-99
PF00023B 14.20 2.636e-09 56-66


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e-09 79-113


15 


PR00208


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.808e-10 517-535
PR00208A 12.59 2.233e-09 520-538


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-14 282-295
PD00066 13.92 9.400e-14 477-490
PD00066 13.92 6.500e-13 505-518
PD00066 13.92 9.500e-13 254-267
PD00066 13.92 1.429e-12 393-406
PD00066 13.92 6.571e-12 421-434


18 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.200e-
-<£ -> o 3 — o U 


20 


BL00487


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333


23 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333 


25 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115T 8.45 7.273e-
29 1208-1242 BL00115Q
18.08 2.776e-21 953-
983 BL00115Y 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.44
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 S.289e-14 591-
617 BL00115I 8.33
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052
BL00115U 10.34 4.47Se-
09 1242-1265


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420A 20.42 4.109e-
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins . 


BL00050A 23.71 9.250e-
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756


Putative esterase. 


PF00756C 14.12 1.108e-
09 486-516 


32 


BL00557 


FMN-dependent alpha- 
hydroxy acid 
dehydrogenases proteins . 


BL00557D 17.76 5.065e- 
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e-
28 227-257 BL00557B 
21.27 8.898e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e- 
35 299-328 PR00629F
10.95 8.364e-32 334-
361 PR00629B 13.66
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137- 
166 


37 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


38 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e-
14 342-360 PR00380C
13.18 6.927e-13 375-
394 PR00380D 9.93
2.180e-12 429-451
PR00380A 14.18 5.154e-
12 143-165 


44 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223 


45


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A 
13.96 2.452e-14 180- 
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C 
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e-
33 6-45 


50 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e-
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e-
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196-
210 PR00988C 13.64 
o . lU<Je-14 104-120 
PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e-
21 294-314 PR00762D
11.29 4.103e-19 509-
530 PR00762A 14.22
9.333e-18 199-217
PR00762F 15.12 3.100e-
16 563-583 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07
2.286e-15 545-562
PR00762G 14.13 6.276e-
13 601-616


56 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e-
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334- 
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393-
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014 


FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.200e-
09 264-279 PR00320B
12.19 8.650e-09 264-
279


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 S.SOOe-
24 401-423 PR00380D
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE 


PR00300A 9.56 7.545e- 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


djli uui /y& JLz . d / b . V86e — 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 

BL00479B 12.57 6.294e-
12 181-197 


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


12 43-83 


107 


DM01970 


0 kv ZK632.12 YDR313C 
ENDOSOMAL III.


16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins . 


BL00191K 17.38 4.951e-
27 238-282 BL00191J 
xx.j/ b.44/e-l7 182- 

204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110


BL01138 


Scorpion short toxins
proteins.


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 5.800e-
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid
binding proteins.


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 8.560e-
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy
transfer proteins. 


BL00215A 15.82 7.158e- 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296

D1*U X U O A X X KJ . *± A Q , jO^C- 

09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 6.894e-
26 28-64 


130


PR00990


RIBOKINASE SIGNATURE 


tr r%.\j \j y 2? yj t> X A . j> 2. y . J>j^6- 

15 47-67 PR00990A 

16 21 ^ ^ftflff»-1it O rt y« O 

j • SUUC Jll 

PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880


Acyl-CoA-binding
protein. 


BL00880 17.52 5.575e- 
26 72-122 


134 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e- 
10 475-496 


136


BL01310


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07
2.800e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175


141 


BL00501 


Signal peptidases I 
serine proteins . 


BL00501D 16.69 9.538e-
14 113-133 BL00501C 
9.61 8.688e-10 89-101 


143 


BL01020


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.400e-
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases 
proteins . 


BL00126C 22.07 1.450e- 
25 509-550 BL00126E 
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479 


151 


BL00632 


Ribosomal protein S4 
proteins . 


BL00632 23.79 5.271e-
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.385e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e-
12 69-124 BL00406C
6.75 9.682e-12 128-183


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1
proteins . 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362 


Ribosomal protein S15
proteins. 


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A 
18.44 1.964e-13 212- 
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45


180 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


20 160-180 PR00007A 

160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


182 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 460-471


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 7.911e-
lb 83-105 PR00450C
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat
proteins . 


PF00564B 24.74 6.164e- 
-Lb Z d. / — A / o 


194 


PR00503 


BROMODOMAIN SIGNATURE 


PR00503D 20.81 9.156e- 

"1 Q "yf\/l—~y}A oonnc mn 
-LD *Uf» — AZ,*± fKUUI>03x> 

9.96 9.571e-13 170-187 


195 


BL00901 


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


18 67-117 


197 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins.


BL01131A 26.62 2.343e-
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B
14.12 4.000e-18 143-
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165




r% tt* a> r\ 

PF00791


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 6.143e- 
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 




PR00007


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 S.731e- 
19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins. 


BL00039D 21.67 1.900e- 
29 568-614 BL00039A 
18.44 1.871e-23 21-60 
BL00039C 15.63 1.720e- 
11 364-388 BL00039B 
19.19 4.064e-ll 277- 
303 


4± / 


BL00100


Chloramphenicol
acetyltransferase
proteins. 


BL00100D 17.22 8.484e- 
09 68-106 




PR00213


MYELIN P0 PROTEIN
SIGNATURE


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 1.000e-
09 901-913


225


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e- 
19 18-39 


226 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229 


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361- 
382 


230 


BL00460 


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e- 
20 35-70 BL00460B 
9.73 7.429e-16 78-96 
BL00460C 14.35 2.831e- 

J-*4 J.J.J. — JLj*± xJJjOU4 b (JjJ 

16.89 8.773e-11 140-
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e- 
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e-
13 7-29 PR00449C
17.27 4.462e-11 47-70
PR00449D 10.79 7.120e-
11 109-123


235 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 251-265 PR00019B
11.36 5.320e-09 119-
133 PR00019B 11.36 
1.000e-08 229-243 


236 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRF^YNA 


PD00289 9.97 8.448e-09 
67-81 


240 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


241


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-
10 616-635


244 


BL00903 


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 8.941e-
12 54-64


245 




w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 8.043e- 
09 124-134 


248 




Wnt-l family proteins. 


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
20.32 1.000e-40 SOS- 
SSI BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e- 
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e-
10 253-275


254


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e- 
09 61-88 


255 




Src homology 3 (SH3)
domain proteins profile. 


BL50002B 15.18 2.800e- 
10 421-435 


259 


PR00094


ADENYLATE KINASE 
SIGNATURE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 


259 


BL00892


HIT family proteins. 


BL00892A 18.17 S.SOOe- 
13 60-91 


262 


BL00388 


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C
18.79 8.147e-16 126-
148


264


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270 


BL00226


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196-
244 BL00226C 13.23
7.000e-20 261-292
BL00226A 12.77 6.143e-
15 96-111


271 


PD02952 


KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULTIGENE FAMI. 


PD02952C 15.76 9.731e- 
16 235-265 PD02952B
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274


BL01027


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7 
proteins . 


BL00052A 27.85 S.OOOe- 
13 137-184 BL00052B 
15.17 S.143e-12 208- 
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85 


281 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C 
13.41 1.000e-21 76-92 
PR00319A 15.27 8.364e- 
21 38-55 PR00319B 
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease . 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326


Tropomyosins proteins . 


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e-
09 517-534 BL00028
16.07 7.429e-09 489-
506


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953


Glycosyl transferase. 


PF00953C 19.70 8.773e- 
34 236-269 PF00953A 
19.68 5.000e-25 102- 
129 PF00953B 6.17 
1.000e-13 182-194 


304 


PF00152


tRNA synthetases class 
II . 


PF00152D 21.30 8.364e- 
28 422-461 PF00152C 
28.03 9.250e-21 220- 
257 PF00152B 15.67 
2.6S8e-13 159-184 
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


305 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


309


PR00237


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13. SO 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X
proteins . 


BL00522C 11.90 7.577e- 
24 315-339 BL00522F 
14.90 1.310e-15 470- 
494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E 19.63 8.615e-
14 430-460 BL00522B
-i/.30 9.625e-12 267- 
313 


310 


BL00326 


Tropomyosins proteins . 


.dxju uJZbU 0. /fa o.^3i>e— 
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


ouu UiiyuA zu.U? 4 . 7Q6e- 
14 151-174 BL00290B 

229 


313 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.96 9.217e-16 1-20


315


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.09-le- " 
IS 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e-
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins . 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e-
20 151-199 BL00232B
32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins.


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins. 


BL01016C 22.84 3.925e-
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e- 
10 4-19 BL01016F 
13.34 1.563e-09 200- 
212 BL01016B 8.93 
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear
protein ran proteins . 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e-
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 2Q-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e-
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187 


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR . 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


354 




Rhodanese proteins . 


BL00380F 9.76 6.694e- 
11 542-553 


355


PF00628


PHD-finger.


PF00628 15.84 l.OOOe- 
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92 
4.300e-09 289-302 


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors . 


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 5.080e- 
10 73-95 PR00450C 
12.22 3.278e-09 109- 
131 


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins.


PF00242Q 13.51 2.328e-
09 22-68 


365 


PF00242 


DNA polymerase (viral)
N-terminal domain
proteins . 


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e- 
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 S.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins . 


BL00478B 14.79 7.750e-
12 410-425 


373


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e-
34 26-65


376


PR00170


SODIUM CHANNEL SIGNATURE


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381 


BL00455


Putative AMP-binding 
domain proteins. 


BL00455 13.31 S.714e~ 
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR.


PD00078B 13.14 5.950e- 
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR . 


PD02870B 18.83 6.000e- 
10 97-130 


388


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 S.OOOe- 
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.6S7e- 
09 151-174 


390 


BL00215 


Mitochondrial energy
transfer proteins. 


BL00215A 15.82 5.200e- 
15 221-246 BL00215A 
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family
proteins.


BL00674B 4.46 2.723e-
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 8.579e- 
11 141-155 


398 


PR00761


BINDIN PRECURSOR 
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676


Dehydrogenase E1
component.


PF00676B 24.71 8.071e-
18 331-369 PF00676D
14.40 3.854e-15 486-
506 PF00676C 16.88
9.182e-14 454-478


402 


BL00514 


Fibrinogen beta and 
gamma chains C- terminal 
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e-
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e- 
09 105-140 


404 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.450e- 
10 73-87 PR00019A 
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e-
18 358-406 BL00232B
32.79 5.500e-16 246-
294 BL00232B 32.79 
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407


PF00426


Outer Capsid protein VP4
(Hemagglutinin).


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.9S5e- 
14 23-78 PF00791B 
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 
5.235e-09 381-420 
PF00791B 28.49 6.202e- 
09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e-
10 228-251


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e- 
J.3 -> o J - Jb / 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e-
12 169-185


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e-
11 195-239 BL00415N
4.29 3.036e-09 809-853


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),


PF01140D 15.54 9.663e-


p15.


10 183-218 PF01140D 
15.54 3.093e-09 246- 
281 


449 


PR00568 


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e-
09 39-53


451 


PF00084 


Sushi domain proteins
(SCR repeat proteins.


PF00084B 9.45 3.813e-
10 47-59


452 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-
09 618-649


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.13 8.286e-
17 230-249 PR00380B
12.64 4.724e-16 194-
212


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 S.950e- 
21 452-473 


467 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 3.721e-
09 282-330


473 


BL00344 


GATA-type zinc finger 
domain proteins.


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 8.909e- 
09 173-199 


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e-
09 393-408


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405


HIV REV INTERACTING
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431


482 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e-
09 937-952 PR00049D
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A
19.33 6.192e-22 626-
653 PR00007C 15.60
5.846e-19 698-720
PR00007D 9.64 3.647e-
13 732-743


487 


PD00567 


PROTEIN RNA-BINDING RNA

■ptr"DT« aT* uvn 
RCircHl nXU. 


PD00567B 18.23 2.853e~ 

no onft *ti a 
13 zuu- Z.X9t 


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e-
17 58-92


497 


PF00429 


ENV polyprotein (coat 


PF00429 31.08 7.171e- 



WO 01/53312 



PCT/US00/34263




polyprotein)


15 21-71 


498 


BL00120 


Lipases, serine
proteins.


BL00120B 11.37 7.923e-
09 185-200


500 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030A 14.39 7.353e- 


501 


BL01159 


WW/rsp5/WWP domain 
proteins.


BL01159 13.85 8.579e- 

-L^I JU 3 i. — J.*»b 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e- 


508 


PR00120 


H+TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE


PR00120C 9.90 5.800e-


509


DM01417


6 kw INDUCING XPMC2
MUSHROOM SPAC22G7.04.


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
33e 


510 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e-
09 293-317


512 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e-
21 1054-1073 PD01841I
23.00 2.667e-13 549-
591


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515


BL00740


MAM domain proteins. 


BL00740A 13.87 7.183e-
12 410-423


516 


DM00892


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain
proteins.


BL00242C 16.86 8.320e-
09 12-42


523 


DM00031




DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319 


Amyloidogenic
glycoprotein
extracellular domain
proteins.


BL00319C 17.12 8.375e-
10 61-95


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins. 


PF00789B 19.70 3.308e-
12 322-343 PF00789C
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase / 
zeta-crystallin 
proteins.


BL01162C 22.80 1.500e-
16 120-164


529 


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532


BL00215


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 

BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e-
15 844-881


538 


BL00762 


WHEP-TRS domain
proteins . 


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100
PD02102D 21.69 1.923e-
30 179-218 PD02102C
26.34 8.929e-26 100-
146


543 


BL00028


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193-
210 BL00028 16.07
1.000e-09 343-360
BL00028 16.07 6.914e-
09 78-95


545 


BL00250 


TGF-beta family 
proteins . 


BL00250A 21.24 8.000e-
31 293-329 BL00250B
27.37 5.286e-24 354- 
390 


547 


PR00319


BETA G-PROTEIN


PR00319B 11.47 2.714e-






(TRANSDUCIN) SIGNATURE 


09 106-201 PR00319A 
15.27 7.344e-09 210- 
227 


548




NF-kappa-B/Rel/dorsal
domain proteins. 


BL01204A 17.74 1.000e-
40 8-56 BL01204D
16.42 1.000e-40 177-
221 BL01204E 13.83
7.652e-33 225-250
BL01204C 13.93 8.714e-
22 141-160 BL01204B
15.41 4.333e-16 102-
116


549


PR00326 


GTP1/OBG GTP-BINDING 


PR00326A 8.75 8.364e- 
15 255-27G 


551 


PF00632 


HECT-domain (ubiquitin- 
transferase).


PF00632C 20.66 3.302e-
23 1569-1601 PF00632B
18.45 3.700e-21 1515-
1543


554 




Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.000e-
14 187-205 BL00290A
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658


Poly-adenylate binding
protein, unique domain 
proteins . 


PF00658C 16.33 9.455e-
32 118-155


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e-
10 472-488


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e-
13 229-268


569 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 454-483 PR00193C
12.60 2.636e-31 223-
251 PR00193B 11.69
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 470-499 PR00193C 
12.60 2.636e-31 239-
267 PR00193B 11.69
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 524-
553


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e-
10 885-929


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e-
09 276-295


577 


BL00116 


DNA polymerase family B 


BL00116A 12.81 5.737e-






proteins.


13 864-877 BL00116B 
11.82 1.529e-12 952-
965


578 


BL00195


Glutaredoxin proteins.


BL00195B 15.31 7.158e- 
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e-
11 217-231 PR00019B
11.36 1.360e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e-
25 275-296 PR00253B
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 1.878e-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.196e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding 
proteins.


PF00013 5.78 1.450e-09 
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e-
13 262-296


589 


BL00478 


LIM domain proteins. 


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e-
15 931-948


591 


PF00855


PWWP domain proteins.


PF00855 13.75 8.000e- 
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542-
558 PR00205C 13.65
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.789e-
18 307-338


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e- 
10 55-39 


600 


BL00242


Integrins alpha chain 
proteins. 


BL00242E 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80








5.000e-11 61-73
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


rKUUJiJUH lo . f *k O.olOe- 

09 198-213 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding
domain proteins.


OUV/v? / JV- _l_ £m , VIA J . ^ jUc — 

12 170-183 


604 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358


608 


PF00855 


PWWP domain proteins.


PF00855 13.75 5.167e-
15 265-282


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787 


615


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436-
455


617 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436-
455


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 5.143e-
12 531-551 DM01206B
10.69 2.603e-10 535- 
555 


621 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e-
10 647-692 BL00239C
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN 
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd 


BL00641C 21.10 1.000e-
40 157-202 BL00641E






subunit proteins.


24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113- 
139 BL00641D 13.23 
9.308e-29 216-240 


627 


PR00103


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e-
10 160-175


630


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e- 
16 4-22 


631 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e-
14 37-50


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e- 
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635 


BL00107


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins. 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623


643 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD-finger.


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
279 BL01129B 12.51
6.118e-13 191-212


649 


BL01228


Hypothetical cof family 
proteins . 


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 6.684e- 
13 771-814 


651


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e-
24 253-274 PR00253C
13.85 8.800e-24 313- 
335 PR00253B 13.47 
3.143e-22 279-301 
PR00253D 16.68 7.652e-








20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.39Se-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 580-595 


659 


DM00215




nrvi nn7i ^ i q d"? *5 nap. 

LfVl U \J A X Z> X 2 - 1 J li.Xf^C 

13 539-572 DM00215 
19 4*^ 4 7^0^-12 ^4 0- 

X J * -J T • / DUC — _1 jL 1 _7 

582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e-
09 224-236


661 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 5.950e- 
23 249-292 


662 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-

X V v A v» 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666 


PR00819


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


ppnnRiOR in fti r Qnn»- 

TT iv \J U O X S J.U t U J □ , 7 UOw 

10 704-720 


667 


BL50040


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e-
09 139-153 PR00019A
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 3.250e-10
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131


ATP-BINDING TRANSPORT 
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C
19.59 1.346e-26 504- 
542 


673 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.557e- 
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e-
13 593-608 PR00320B
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608


675 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-








629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 9.667e-
09 249-263


679 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e- 
16 225-236 PF00642 
11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.200e-
19 227-257


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e-
09 99-118


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e-
10 538-553


689 


BL01024 


Protein phosphatase 2A
regulatory subunit PR55
proteins. 


BL01024A 10.26 1.000e-

dU ji^-t)? BJ-iu X\) £*t a 

8.91 1.000e-40 86-127

TiT.f* i niac *7 ro i nnrie- 
DituiuziL / ♦ ty\j x .uuuc 

40 146-18S BL01024D 

1 J , 1 , WUUC IV JL O 3 

222 BL01024E 11.96 

1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 8.071e- 
31 152-195 


692 


BL00211 


ABC transporters family
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e-
17 173-195


697 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267- 
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins.


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e-
14 66-82


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e-
10 822-841


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 



DM01354Y 10.69 4.977e-
38 425-465 DM01354X
13.86 7.300e-34 376- 
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 7.54Se- 
27 450-496 BL00039A
18.44 2.537e-18 147- 
186 BL00039C 15.63 
2.216e-14 280-304
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383


Tyrosine specific 
protein phosphatases 
proteins.


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 2.688e-28 84-118
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e- 
34 135-161 PR00704F 
13.61 7.000e-26 190- 
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e-
23 75-98 PR00704A
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e- 
11 323-338 PR00320B 
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.565e- 
28 26-65 BL00039D 
21.67 2.105e-20 338- 
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.34Se-21 60-78 


748 


BL00612


Osteonectin domain 
proteins. 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 


752 


BL00795 


Involucrin proteins.


BL00795C 17.06 6.000e- 
11 384-429 BL00795C 
17.06 9.444e-11 370-
415


754 


BL00051 


Ribosomal protein L39e
proteins. 


BL00051 20.92 1.935e-
16 4-50


755 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020


SAR1 family proteins.


BL01020C 15. 3S 9.020e- 
12 99-150 


762 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88


763 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208


VWFC domain proteins. 


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770 


BL00031


Nuclear hormones 
receptors DNA-binding 
region proteins.


BL00031A 19.55 9.571e-
32 208-241 BL00031B
22.25 5.500e-27 242-
274


772 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44


773 


BL00523


Sulfatases proteins. 


BL00523E 19.27 9.333e- 
23 299-329 BL00523A 
13.36 2.200e-13 47-64 
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F
10.85 5.821e-10 373-
384


775 


BL00028


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e-
09 568-585


776 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e- 
09 621-638 


777


BL00028


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 8.412e-
11 322-341 BL00030A
14.39 7.000e-10 220- 
239 


779 


PR00079 


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e- 
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00239 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690 


DEAH-box subfamily ATP-
dependent helicases
proteins.


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e-
09 218-228


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e-
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D
10.79 1.545e-09 111-
125


788 

790


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.767e- 
10 1-21 




BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e- 
39 725-764 BL00915B 



22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208 



GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e- 
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e-
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e- 
13 40-58


799 


BL01052 


Calponin family repeat 
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A 
16.12 1.529e-32 3-35 
BL01052B 15.31 1.257e- 
25 52-78 BL01052D 
10.26 5.737e-25 174- 
194 


800 


BL00348 


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e-
09 197-240 


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins. 


BL00309C 18.65 1.621e-
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e-
09 187-199 


804 


PF00774 


Dihydropyridine 
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e-
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811 


BL00685 


CBF-A/NF-YB subunit 
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.908e-17
22-65 


815 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e- 
15 158-171 PD00066 
13.92 5.200e-14 46-59 
PD00066 13.92 7.000e- 
14 18-31 PD00066 
13.92 7.000e-13 130-
143 PD00066 13.92 
7.500e-13 214-227 
PD00066 13.92 9.000e- 
13 102-115 PD00066 
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family 
proteins.


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855


FLAVOPROTEIN PROTEIN 
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e-
28 88-124 PD02855B 
8.36 6.478e-09 132-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87
PR00405A 17.71 7.283e-
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B 
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e-
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e-
11 538-549 PR00700E
17.57 3.100e-10 522- 
538 


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e-
13 134-153 


844 


PD02785


PROTEIN RIBOSOMAL 60S
L22 RNA-BINDING HEP.


PD02785B 14.43 1.000e-
40 58-112 PD02785A 
15.23 1.915e-28 8-57 


845 


BL00826


MARCKS family proteins. 


BL00826C 7.63 6.738e- 
09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e-
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 723-778 BL00420B 
22.67 1.321e-38 933- 
988 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 808- 
819 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1018-
1029 BL00420C 11.90 
7.955e-10 567-578 


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins.


BL00420B 22.67 1.000e-
40 756-811 BL00420B 
22.67 1.321e-38 966- 
1021 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 620-675 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218 
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 863-918 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C
11.90 1.900e-12 841- 
852 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 2.929e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e-
17 23-41 PR00988C 
13.64 8.714e-16 107- 
123 PR00988F 12.23 
7.828e-15 198-212 
PR00988E 8.27 9.769e- 
12 176-188 PR00988D 
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72 


863 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267 


866 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase
proteins.


BL01287A 17.95 2.688e- 
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874


BL00188


Biotin-requiring enzymes 
attachment site 
proteins.


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877


PD02102


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL.


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 

O *S — XU1 oLUUzt)4h 

17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 
897 


PR00391 
PR00327 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 

ICE NUCLEATION PROTEIN 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-ll 16-36 
PR00327C 6.37 5.247e-






SIGNATURE 


09 313-328 


898 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 7.800e- 
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
16 254-267 PD00066 
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e-
09 97-111 


904 


PR00381 



KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291- 
309 PR00381F 9.13 
3.288e-22 370-392 
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e- 
11 293-314 PR00381E 
8.75 8.364e-10 377-398 
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins.


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-54 


926 


BL00019 


Actinin-type actin-
binding domain proteins. 


BL00019C 14.66 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A
12.56 2.373e-10 34-45


928 


BL00678


Trp-Asp (WD) repeat 


BL00678 9.67 9.308e-11






proteins proteins. 


273-284 BL00678 9.67 
1.600e-10 314-325 
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085 


Ribulose-phosphate 3- 
epimerase family 
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085


Ribulose-phosphate 3- 
epimerase family 
proteins. 


BL01085D 16.55 4.600e-
24 152-183 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 190-220 BL01085C 
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168 


C2 domain proteins.


PF00168C 27.49 4.000e- 
12 336-362 


937


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-
10 5-49 


940


PR00862 


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945


BL01230 


RNA methyltransferase
trmA family proteins. 


BL01230B 11.62 2.373e-
09 407-420 


948


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins.


BL00479B 12.57 7.429e- 
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e- 
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 3.250e- 
12 47-60


957


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
15 111-148 


959 


BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236


963


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e-
11 210-225 


966 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e- 
12 104-124 DM01206B 
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B 
10.69 3.962e-09 108- 
128 DM01206B 10.69 
5.671e-09 38-58 


969 


PF01008 


initiation factor 2 
subunit.


PF01008B 25.59 4.724e- 
31 417-460 PF01008C
12.25 5.333e-18 506- 
526 PF01008A 20.14 
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH 
proteins.


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159


WW/rsp5/WWP domain,
proteins.


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171-
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167


Ribosomal protein L17 
proteins.


BL01167B 20.66 8.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59


981 


PF00992 


Troponin.


PF00992A 16.67 8.816e- 
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56 
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins.


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 497-513


994 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304 


ubiH/COQ6 monooxygenase 
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75
6.211e-23 217-240 
PR00926E 11.70 6.625e-








20 174-193 PR00926B 
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
lb 1J.-25 PR00926F 
17.75 5.565e-09 120- 
143 


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
rJJ->UU<3UbU 12.bo 3.700e- 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
8.69 4.667e-20 98-118 
PR00304B 11.60 7.577e-
19 68-87 PR00304A
9.20 3.382e-16 46-63
PR00304E 7.79 6.870e-
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-
11 174-194 


1013 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION.


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine
proteins.


BL00175A 15.42 5.179e- 

12 6-26 BL00175C 

■4 J « '-5 O.UD4C-1U /9-j.n 


1025 


PR00305


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
j.o zjo-JoB OL00353C 
14.83 8.844e-ll 288- 
335 


1028 


BL00183 


Ubiquitin-conjugating

enzymes proteins. 


33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


09 111-133 


1034 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY 
SIGNATURE 


JfKUU4UJi lb. to 3.429e- 
09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.657e- 
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039


BL00299


Ubiquitin domain 
proteins.


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970


ARGININE ADP-
RIBOSYLTRANSFERASE


PR00970A 17.73 6.143e-
20 56-78 PR00970D






SIGNATURE 


9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 6.786e- 
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 
proteins. 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047


BL01216


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins.


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e-
12 102-136 


1050 


BL01073


Ribosomal protein L24e
proteins.


BL01073 24.30 1.000e-
40 12-62 


1054 


BL00571 


Amidases proteins. 


BL00571 25.69 5.875e- 
31 160-212 


1055 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e- 
11 98-117 BL00030B 
7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins 
domain proteins.


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80 
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455 


Putative AMP-binding 
domain proteins.


BL00455 13.31 6.211e-
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.8S0e-09 87-101 


1066 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 8.518e- 
11 164-197 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e-
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins. 


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 4.316e-09






proteins proteins. 


298-309 


1081 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460


Glutathione peroxidases 
selenocysteine proteins.


BL00460A 28.67 3.204e- 
18 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133-
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216 


1105


PF00881


Nitroreductase family. 


PF00881A 27.15 9.229e-
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 53-85 


1116 


BL00355 


HMG14 and HMG17 
proteins.


BL00355 5.97 2.528e-25 
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins.


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.800e-
09 87-101


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B
15.11 1.360e-14 59-80 


1132 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor

complexes medium chain 
proteins. 


BL00990C 18.78 4.176e- 
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e- 
27 157-187 BL00990D
16.13 5.320e-18 403- 
422 


1137 


PR00314


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 5.000e-
34 100-128 PR00314D
9.66 3.531e-33 233-261
PR00314C 16.05 8.909e-
32 159-188 PR00314A
14.53 1.281e-22 13-34 


1139 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
19 451-482 BL00107B
13.31 3.077e-12 519-
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases 
proteins.


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 1.794e- 
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e-
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215


Mitochondrial energy
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983


Ly-6 / u-PAR domain
proteins.


BL00983C 12.69 2.761e- 
10 77-93 


1188 


BL00878 


Orn/DAP/Arg
decarboxylases family 2
pyridoxal-P attachment
si.


BL00878B 10.95 6.000e- 
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402
BL00878D 16.56 1.621e-
09 270-289 


1191 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 72-101 PR00345E
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e- 
28 101-125 PR00345D 
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62


1194 


PR00345 


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e- 
28 108-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.964e-24 161-
185 PR00345A 13.46 
5.645e-16 79-98 


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e- 
11 15-47 


1197 


BL01298 


Dihydrodipicolinate 
reductase proteins. 


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e-
09 213-229 


1206 


BL01183


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins.


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins. 


PF00023A 16.03 4.857e-
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43) 
proteins.


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.348e- 
11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e- 
15 295-308 PD00066 
13.92 7.231e-15 406- 
419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e-
12 434-447 PD00066
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61 


122* 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230 


BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 8.297e- 
10 5-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e- 
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e-
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e-
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins.


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.670e-
11 8-52 


1256 


BL00373


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e-
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e-
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
13.09 9.500e-15 51-63 
PR00070A 12.92 5.500e- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e- 
18 165-182 PR00837A 
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 
12 201-215 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e-
22 40-63 PR00449E
13.50 1.000e-16 137-
160 PR00449D 10.79
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins 
proteins.


BL00276A 8.87 1.500e- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e-
09 228-243 


1276 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-
12 119-135 PR00412C
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1279 


BL00134 


Serine proteases,
trypsin family,
histidine proteins.


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine
proteins.


BL01220C 14.75 9.348e- 
lo Z48-276 


1285 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 2.286e-10 33-42


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY
SIGNATURE


PR00802B 16.51 1.610e- 
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e-28 82-126
BL00127B 26.57 8.800e-28 23-68


1302


PR00637


TYPE 3 BOMBESIN RECEPTOR
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e-17 13-38
BL00215A 15.82 1.000e-16 226-251
BL00215A 15.82 2.658e-13 107-132


1308 


PR00898


VASOPRESSIN V2 RECEPTOR
SIGNATURE


PR00898H 11.34 4.682e-09 552-572


1309 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins.


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins.


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e- 
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin-like
repeat proteins.


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-176 


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 7.239e-09 25-43


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 6.769e- 
33 10-49 


1336


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-09 262-281


1337


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-09 211-230


1340 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE 


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e-21 383-422


1344 


DM00099 


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 8.313e- 
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651 


BTB (also known as BR-C/Ttk)
domain proteins.


PF00651 15.00 7.231e- 
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179- 
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-ll 97-124 
PR00447G 6.69 9.877e- 
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303A 21.77 6.667e- 
26 45-82 BL00303B 
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD-box subfamily
ATP-dependent helicases
proteins.


BL00039D 21.67 5.950e-29 375-421
BL00039A 18.44 7.136e-29 99-138
BL00039C 15.63 4.000e-18 225-249
BL00039B 19.19 3.182e-14 141-167


1357 


PF00615 


Regulator of G protein 
signalling domain 
proteins.


PF00615B 16.25 2.216e-12 84-101
PF00615C 10.06 8.412e-12 162-176


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 9.234e- 
29 10-49 


1361 


PR00925


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249- 
274 BL01272A 6.49 


1363 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e-30 113-148
BL01272C 11.68 3.314e-25 226-251
BL01272A 6.49 1.231e-18 76-94


1364 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e-10 1-19


1371 


BL00242


Integrins alpha chain
proteins.


BL00242B 8.13 8.615e- 
09 469-479 


1372 


PR00625


DNAJ PROTEIN FAMILY
SIGNATURE


PR00625B 13.48 7.353e-19 46-67
PR00625A 12.84 1.391e-16 14-34


1373 


BL00434 


domain proteins.


BL00434C 23.85 3.770e- 
09 90-130 


1374 


PR00962


PROTEIN SIGNATURE 


PR00952C 8.00 6.337e- 
09 505-526 


1375 


PD02475 


riu^j.lM c-PiTHELIAL TUMOR- 
ASSOCIATE . 


PD02475A 23.18 8.552e- 
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 9.571e- 
32 24-63 


1380 


BL00194 


Thioredoxin family 
proteins.


BL00194 12.16 8.333e-12 48-61


1381 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 1.458e- 
15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 6.203e- 
10 95-132 


1386 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e- 
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-11 52-61


1389 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 3.512e- 
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e-25 88-110
PR00380D 9.93 2.406e-20 304-326
PR00380B 12.64 4.414e-16 208-226
PR00380C 13.18 6.538e-16 243-262


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-14 462-475
PD00066 13.92 8.800e-14 348-361
PD00066 13.92 9.571e-12 405-418
PD00066 13.92 6.087e-11 490-503
PD00066 13.92 8.043e-11 320-333


1398


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 6.786e-32 10-49


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


1408


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.550e-11 179-193
PR00019A 11.19 8.826e-10 228-242
PR00019B 11.36 1.360e-09 199-213
PR00019B 11.36 4.960e-09 176-190


1409 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e-12 182-202
PR00510B 12.96 8.767e-12 210-230
PR00510F 9.88 8.172e-10 58-75
PR00510D 9.21 2.367e-09 251-267


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5 
proteins. 


BL00358B 22.76 1.000e-40 57-103
BL00358C 13.75 6.087e-14 122-136
BL00358D 14.26 5.500e-13 143-158
BL00358A 13.06 1.931e-11 33-44


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins.


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins.


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE


PR00681G 12.54 2.149e-09 38-60


1418 


DM00973


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-40 142-196
PD01941B 15.02 7.049e-30 400-447
PD01941E 15.92 2.475e-20 817-864
PD01941C 19.96 3.118e-19 488-543
PD01941D 27.18 9.614e-18 641-690
PD01941F 28.52 5.382e-15 1038-1093


1422 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 6.318e- 
11 1009-1028 


1424


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e-14 367-386
BL50002A 14.19 9.250e-12 298-317
BL50002A 14.19 4.462e-11 208-227
BL50002B 15.18 1.000e-09 244-258


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e-16 281-299
PR00405A 17.71 4.306e-14 262-282


1428 


BL00039 


DEAD-box subfamily
ATP-dependent helicases
proteins.


BL00039D 21.67 5.219e- 
34 147-193 


1429 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928 


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-10 103-124


1433 


BL01113


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e-10 135-150


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030A 14.39 1.000e-12 84-103


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e-09 250-268
BL00290A 20.89 4.000e-09 188-211


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-08 114-138


1445 


PD01841


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-40 73-123
PD01841B 14.35 1.000e-40 144-185
PD01841D 17.87 1.000e-40 206-258
PD01841F 13.36 1.000e-40 296-345
PD01841G 24.26 1.000e-40 349-403
PD01841I 23.00 1.000e-40 494-536
PD01841J 14.94 1.000e-40 895-932
PD01841L 18.42 1.000e-40 1083-1125
PD01841E 18.60 9.719e-38 258-296
PD01841K 14.81 1.000e-35 1041-1071
PD01841H 21.30 3.189e-31 435-472
PD01841C 13.78 1.000e-25 185-206
PD01841M 10.82 1.250e-20 1175-1194


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e-09 190-220


1447 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE


PR00048A 10.52 2.080e- 
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR.


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e-10 94-104


1454 


DM01688


POLY-IG RECEPTOR.


DM01688D 13.44 7.146e-09 382-405


1455 


PF00777 


Sialyltransferase
family.


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase
proteins.


BL00545C 11.28 7.353e- 
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins.


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins.


PF00686A 13.45 9.100e-09 267-277


1477


PF00566 


Probable rabGAP domain 
proteins.


PF00566A 12.64 7.333e-10 466-476


1478 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479


DM00406 


GLIADIN. 


DM00406 7.73 8.541e-10 292-305


1480 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 2.385e-15 69-87
BL00290A 20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780 


Domain found in
NIK1-like kinases, mouse
citron and yeast ROM.


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e- 
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039


DEAD-box subfamily
ATP-dependent helicases
proteins.


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.607e-24 190-226
BL00166C 18.93 5.500e-14 140-167
BL00166B 16.92 9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins. 


BL00452D 28.59 3.700e-31 63-106
BL00452E 11.92 3.045e-13 115-131


1492 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-11 384-400
BL00107A 18.39 5.345e-11 322-353


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins.


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972 


Ubiquitin carboxyl-terminal
hydrolases
family 2 proteins.


BL00972D 22.55 5.500e-14 311-336
BL00972A 11.93 7.429e-14 48-66
BL00972E 20.72 8.759e-10 341-363


1512 


BL00523


Sulfatases proteins. 


BL00523E 19.27 4.536e-22 76-106
BL00523D 9.89 1.563e-11 40-52
BL00523F 10.85 4.162e-09 159-170
BL00523G 9.46 5.333e-09 256-266


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600


Aminotransferases class-III
pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-19 98-122
BL00600E 16.43 1.771e-17 302-331
BL00600G 12.43 9.625e-17 377-396
BL00600B 19.60 5.091e-15 160-186
BL00600C 16.18 6.040e-12 190-206
BL00600F 8.77 1.000e-11 343-356
BL00600D 8.71 1.000e-10 281-295


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 4.774e-11 192-207
PR00320B 12.19 8.839e-11 272-287
PR00320B 12.19 9.743e-10 106-121
PR00320A 16.74 1.878e-09 192-207
PR00320A 16.74 2.317e-09 106-121
PR00320A 16.74 8.683e-09 272-287
PR00320C 13.01 8.800e-09 106-121


1538


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.508e-15 171-184


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed).


PF00781D 11.11 7.593e-10 103-127


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e-29 312-334
PR00965E 12.93 5.846e-29 172-195
PR00965F 5.98 1.123e-28 209-231
PR00965C 15.04 1.000e-27 131-151
PR00965D 5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-27 258-279
PR00965B 4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-25 35-55
PR00965I 3.91 6.442e-25 385-406


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING 
BINDING DNA. 


PD02699C 24.84 1.000e-40 599-646
PD02699A 8.91 2.286e-34 219-248
PD02699B 18.28 6.143e-21 485-509


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 93-142
BL00951D 13.94 8.714e-40 142-177
BL00951A 15.10 1.000e-38 2-38
BL00951B 14.23 o.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e-30 279-318
BL00536D 22.91 5.737e-24 21-65
BL00536E 16.94 4.696e-18 248-279


1549 


PR00139


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 5.119e-09 58-73


1556


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e-13 67-105


1557 


BL01228 


Hypothetical cof family 
proteins.


BL01228D 17.44 8.105e-12 107-132


1558 


BL01228 


Hypothetical cof family 
proteins.


BL01228D 17.44 8.105e-12 107-132


1559 


BL01228 


Hypothetical cof family 
proteins.


BL01228D 17.44 8.105e-12 107-132


1562 


BL00522


DNA polymerase family X
proteins.


BL00522C 11.90 6.600e-18 412-436
BL00522B 27.30 1.738e-16 364-410
BL00522A 25.52 6.000e-16 279-326
BL00522E 19.63 6.123e-14 502-532
BL00522F 14.90 2.385e-13 551-575


1563 


PF00651


BTB (also known as BR-C/Ttk)
domain proteins.


PF00651 15.00 1.947e- 
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins.


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.594e-17 184-228
BL01013C 9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.400e-10 378-389
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306


1570 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665 


OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.673e-24 364-384
PR00665D 9.93 1.200e-22 138-155
PR00665F 11.73 4.000e-22 337-354
PR00665C 5.89 1.000e-20 65-80
PR00665B 5.29 4.337e-19 24-39
PR00665E 5.60 2.929e-15 246-260
PR00665A 5.99 5.622e-15 11-25


1577 


DM00099 


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e-10 127-137


1579 


BL00524 


Somatomedin B domain 
proteins.


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894 


HYDROLASE N4- PRECURSOR
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e-16 182-215
PD02894A 21.96 2.125e-10 57-103


1581 


BL00411 


Kinesin motor domain 
proteins.


BL00411C 15.04 5.292e-12 32-54
BL00411H 15.66 4.441e-11 245-276


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-10 225-238


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01354S 11.61 7.750e-09 474-495


1587


PR00072


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-33 180-210
PR00072A 12.75 6.040e-25 120-145
PR00072C 11.42 2.286e-24 216-239
PR00072D 10.77 3.400e-22 276-295
PR00072E 10.54 1.360e-19 301-318
PR00072G 10.45 5.304e-19 433-450
PR00072F 8.37 5.935e-15 332-349


1589 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins.


BL00191H 15.64 1.537e-22 51-113
BL00191K 17.38 9.027e-12 398-442


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME.


DM00517B 10.96 6.625e-16 1175-1193
DM00517A 8.21 1.000e-11 1015-1025


1592 


BL00037 


Myb DNA-binding domain
proteins repeat proteins
proteins.


BL00037B 15.92 3.250e-27 116-142
BL00037A 16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-12 31-55
BL00037B 15.92 3.526e-11 64-90
BL00037C 16.86 9.654e-10 146-164


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628 


PHD-finger.


PF00628 15.84 3.250e- 
11 1667-1682 


1599 


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE


PR00014D 12.04 5.500e-09 980-995


1600 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR-C/Ttk)
domain proteins.


PF00651 15.00 3.571e-10 44-57


1607 


BL00252 


Interferon alpha, beta 
and delta family 
proteins. 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-08 61-94


1611 


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e- 
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 6.051e-09 932-983
BL00412D 16.54 7.153e-09 933-984


1614 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 3.531e-25 54-83
BL00559K 13.17 2.957e-18 197-224
BL00559J 19.63 6.870e-16 124-176
BL00559L 13.60 9.000e-16 266-284


1615 


PD01427 


TRANSFERASE
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e-22 500-541
PD01427A 19.94 8.773e-18 439-472


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins.


BL00115Z 3.12 7.485e-09 152-201
BL00115Z 3.12 9.603e-09 145-194


1617 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 7.750e-32 51-88
BL00303A 21.77 1.947e-31 4-41


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI. 


PD01888B 25.10 1.000e-40 47-97
PD01888C 21.56 7.000e-30 125-155
PD01888A 12.84 8.800e-15 7-23


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing
proteins.


BL00325B 21.66 1.000e-40 93-139
BL00325A 24.83 6.786e-23 61-93


1631 


BL00064 


L-lactate dehydrogenase 
proteins.


BL00064B 23.57 1.000e-40 82-130
BL00064C 17.28 1.000e-40 137-182
BL00064E 27.20 1.000e-40 223-275
BL00064F 25.14 7.882e-36 286-331
BL00064A 21.16 1.000e-33 22-60
BL00064D 14.19 6.500e-31 182-212


1632 


PR00063


RIBOSOMAL PROTEIN L27
SIGNATURE


PR00063B 15.24 9.700e-11 59-84
PR00063A 11.71 1.614e-09 34-59


1634 


PR00239


MOLLUSCAN RHODOPSIN
C-TERMINAL TAIL SIGNATURE


PR00239D 0.00 1.105e-11 36-49
PR00239C 3.51 2.538e-09 37-45


1636 


BL01210 


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e-11 364-379
PR00320A 16.74 7.828e-11 364-379
PR00320C 13.01 2.800e-10 279-294
PR00320C 13.01 2.800e-10 364-379
PR00320B 12.19 5.114e-10 279-294
PR00320A 16.74 1.659e-09 279-294
PR00320A 16.74 [illegible]e-09 229-244


1642 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 6.464e-09 114-130


1643 


PR00169 


POTASSIUM CHANNEL
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10 109-120
BL00678 9.67 5.737e-09 528-539


1645 


BL01108 


Ribosomal protein L24 
proteins.


BL01108A 20.33 7.366e- 
17 56-89 


1646 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e-21 103-125
PR00380D 9.93 6.300e-18 386-408
PR00380C 13.18 7.923e-16 332-351
PR00380B 12.64 6.657e-15 292-310


1647 


DM01242 


J ixLKJbwJM J.JMfc,- - lHUA 
LIGASE.


DM01242C 17.15 9.791e- 
37 340-381 DM01242E 
23.00 5.071e-31 463- 
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e- 
18 265-314 DM01242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
tow Miirr pis 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.720e- 
11 431-485 


1652 


BL00933


FGGY family of
carbohydrate kinases
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795 


Involucrin proteins.


BL00795C 17.06 2.988e-10 70-115


1654 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-17 302-334


1655 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-17 282-314


1656 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family signa.


BL00741B 14.27 1.391e- 
16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972


Ubiquitin carboxyl-terminal
hydrolases
family 2 proteins.


BL00972D 22.55 4.140e-12 376-401
BL00972E 20.72 5.629e-09 446-468


1660 


BL00406 


Actins proteins.


BL00406D 12.58 8.767e- 
15 188-243 


1661 


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE


PR00105A 10.36 4.900e-13 1140-1157
PR00105B 12.32 2.800e-12 1259-1274
PR00105C 10.86 1.000e-10 1305-1319


1662


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e- 
33 3119-3163 


1663 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-23 107-125
PR00319C 13.41 5.714e-20 89-105
PR00319A 15.27 5.286e-19 51-68
PR00319B 11.47 6.200e-19 70-85


1664


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10 
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e-17 115-141
BL01153C 13.67 8.977e-15 66-80
BL01153B 20.52 1.885e-10 13-37


1671 


PR00678


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e- 
10 1146-1169 


1672 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 7.580e-11 343-358
PR00049D 0.00 1.286e-10 342-357


1676 


PR00747


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-19 427-448
PR00747G 14.50 2.286e-18 368-393
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747D 15.23 8.759e-17 163-183
PR00747E 15.13 8.244e-15 254-272
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 311-328


1677 


PR00747


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-19 309-330
PR00747G 14.50 2.286e-18 250-275
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 193-210


1680 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67 
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 329-340
BL00678 9.67 6.684e-09 243-254


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 1.346e-13 389-410


1685 


PR00646


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 
10 420-435 


1692 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-10 487-502
PR00456E 3.06 7.281e-10 488-503
PR00456E 3.06 8.125e-10 489-504


1693 


BL00674 


AAA-protein family
proteins.


BL00674C 22.60 8.043e-24 274-317
BL00674B 4.46 4.000e-23 241-263
BL00674D 23.41 8.560e-18 338-385
BL00674E 15.24 1.720e-15 414-434


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-ll 162-186 
PR00466F 9.16 6.159e- 
09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-12 283-300
BL00028 16.07 3.769e-11 255-272
BL00028 16.07 5.154e-11 171-188
BL00028 16.07 5.500e-11 227-244
BL00028 16.07 1.600e-10 199-216


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e-15 62-102
BL01019B 19.49 4.000e-15 107-162


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER
METAL-BINDING NU.


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.565e-10 116-130
PR00019B 11.36 4.600e-09 113-127
PR00019B 11.36 7.120e-09 204-218


1711 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712


PF00023


Ank repeat proteins.


PF00023A 16.03 7.000e- 
10 187-203 


1713


PF00642


Zinc finger C-x8-C-x5-C-x3-H
type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 




Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-helix' dimerization domain proteins.


BL00038B 16.97 8.448e-12 79-100
BL00038A 13.61 4.000e-11 52-68


1723 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567 [score illegible] [p-value mantissa illegible]e-09 418-428


1724 


BL01279


Protein-L-isoaspartate(D-aspartate) O-methyltransferase signa.


BL01279A 24.27 5.663e-12 233-281


1728 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 2.059e-11 73-86
BL00018 7.41 4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e-09 17-61


1731


BL01160 


Kinesin light chain repeat proteins.


BL01160B 19.54 9.676e-10 296-350


1732 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family . 


PF00850F 15.70 4.349e-22 246-279
PF00850D 14.76 6.850e-20 177-201
PF00850E 8.88 8.691e-18 209-235
PF00850G 22.75 4.098e-14 281-323


1734 


BL00354


HMG-I and HMG-Y DNA-binding domain proteins (A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179


KINASE ALPHA ADHESION
T-CELL. 


DM00179 13.97 5.263e-10 492-502


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1745 


BL00720


Guanine-nucleotide 
dissociation stimulators
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439


Acyltransferases
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI. 


PD00066 13.92 3.400e- 
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 
13 61-74 PD00066 
13.92 6.571e-12 117- 
130 


1753 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 6.516e-18 33-77


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 9.750e- 
35 10-49 


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e- 
09 224-278 


1765 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.077e- 
14 523-539 


1776


BL00942


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e-09 279-312


1778


BL00084 


Copper type II, ascorbate-dependent monooxygenases proteins.


BL00084D 25.11 3.700e-20 169-2?4
BL00084[subtype illegible] 24.26 8.134e-16 10-58
BL00084C 27.71 8.412e-11 107-158


1779 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.831e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence.
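
The result format described in the footnote above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, that splits one results cell into its four fields. The names `SignatureHit` and `parse_hits` are hypothetical, and the accession pattern (two capital letters, five digits, optional subtype letter) is an assumption inferred from the entries in the table.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str    # accession number plus optional subtype letter, e.g. "BL00353C"
    raw_score: float  # raw score of the signature match
    p_value: float    # p-value of the match
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature in the amino acid sequence


# Assumed layout of one hit: "<accession> <raw score> <p-value> <start>-<end>"
HIT_PATTERN = re.compile(
    r"(?P<acc>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_hits(cell: str) -> List[SignatureHit]:
    """Return every hit found in one results cell, in the order listed."""
    return [
        SignatureHit(
            accession=m.group("acc"),
            raw_score=float(m.group("score")),
            p_value=float(m.group("p")),
            start=int(m.group("start")),
            end=int(m.group("end")),
        )
        for m in HIT_PATTERN.finditer(cell)
    ]


if __name__ == "__main__":
    # Example cell taken from the HMG1/2 entry of the table.
    for hit in parse_hits(
        "BL00353C 14.83 6.018e-10 136-183 BL00353B 11.47 8.866e-09 86-136"
    ):
        print(hit)
```

A cell that lists several hits, as many entries in the table do, yields one record per hit.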

SEQ ID
NO:


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 




Immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-21 


84.9


5 


fn3 


Fibronectin type III domain 


0 


1097.1


6 


fn3 


Fibronectin type III domain


0 


1035.0 


7 


fn3 


Fibronectin type III domain


0 


1090.4 


8 


fn3 


Fibronectin type III domain


0 


1097.1 


9 


TBC 


TBC domain 


4e-40 


146.7 


10 


p450


Cytochrome P450 


9.5e-17


62.0 


12 


ank 


Ank repeat 


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05


22.7


15 


zf-MYND 


MYND finger 


1.3e-06


35.4 


16 


zf-MYND 


MYND finger 


1.3e-06


35.4


17 


zf-C2H2


Zinc finger, C2H2 type 


1.7e-99 


343.9 


18 


CAP_GLY 


CAP-Gly domain 


1.2e-25


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119 


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102


352.6


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79


277.0


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit 


0


1077.7


26 


C1q


C1q domain


1.9e-10


44.4


27 


Ribosomal_L23


Ribosomal protein L23 


7.8e-32


111.2 


28 


Ribosomal_L23


Ribosomal protein L23 


1e-29


104.2


30 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5 


31 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5 


32 


FMN_dh


FMN-dependent dehydrogenase


5.4e-179 


608.1 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.9e-59


209.9


35 


ig


Immunoglobulin domain


1.4e-13


48.8 


36 


ig 


Immunoglobulin domain 


1.4e-13


48.8


40 


kinesin 


Kinesin motor domain 


6.7e-76


265.6


44 


Ets 


Ets-domain


1.4e-56


182.1 


45 


Ets


Ets-domain


1.4e-56


182.1


46 


LRR 


Leucine Rich Repeat


1.7e-13


58.3


48 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-162 


552.8 


49 


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05 


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


8.5e-45


162.3 


53 


PRK 


Phosphoribulokinase 


2.1e-65 


230.7 


54 


myb_DNA-binding


Myb-like DNA-binding domain 


0.096 


15.2 


55 


voltage_CLC


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3


67 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54


192.2


69


C2 


C2 domain 


2.3e-54


194.0


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 


ig 


Immunoglobulin domain 


8.2e-28


94.7


73 


pkinase 


Eukaryotic protein kinase 


8e-69 


242.1


74 


pkinase 


Eukaryotic protein kinase
domain


2.8e-38


140.6


76 


zf-C4_Topoisom


Topoisomerase DNA binding C4 
zinc fing 


5.4e-54 


192.8 


83 


Peptidase_S9


Prolyl oligopeptidase family 


4.3e-10


36.8 


84 


fn3 


Fibronectin type III domain


4.1e-51


183.2


86 


SH2 


Src homology domain 2


3.1e-22


67.7


88 


ig 




0.0091


14.0


89 


WD40


WD domain, G-beta repeat


2.1e-21


84.6


92 


laminin_G


Laminin G domain 


6.1e-27 


98.5


93 


AMP-binding 


AMP-binding enzyme 


2.4e-13


-37.2


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


2.6e-51 


183.9


97 


adh_short


short chain dehydrogenase 


2e-61 


217.5 


98 


kinesin 


Kinesin motor domain


2.2e-86


300.4 


101 


IRS 


PTB domain (IRS-1 type)


5.4e-36 


133.0 


102 


AAA 


ATPases associated with various
cellular act


6.8e-05


-5.2 


104 


pkinase 


Eukaryotic protein kinase 
domain 


2 . 7e-73 


256.9 


106 
107 


ras 
FYVE 


FYVE zinc finger 


8.3e-24 


92.5


108 


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase 


5.4e-27 
7.7e-61 


100.7 
215.5 


109 


zf-C2H2


Zinc finger, C2H2 type


2.3e-122


420.0 


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2 


116
117


PH 


PH domain 


3.1e-11


45.2 






Lipocalin / cytosolic fatty-acid binding pr


2.4e-14 


53.5 


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20


76.3 


120 


WD40


WD domain, G-beta repeat


2.4e-14


61.1


121 


WD40 


WD domain, G-beta repeat


2.4e-14


61.1 


123 
124 


IF5_eIF4_eIF2

ig


eIF4-gamma/eIF5/eIF2-epsilon
Immunoglobulin domain


1e-32


122.2


127 


mito_carr


Mitochondrial carrier proteins


6.5e-08
3e-16


30.6
58.6


128 


PP2C 


Protein phosphatase 2C 


2.2e-71


250.6 


129 
130 


ATP1G1_PLM_MAT8

pfkB


ATP1G1/PLM/MAT8 family

pfkB family carbohydrate kinase


3.1e-20


80.6 


133 


ACBP 


Acyl CoA binding protein


4.5e-42
4.6e-22


137.1
86.7


134 
135 


rrm

IQ


RNA recognition motif.

IQ calmodulin-binding motif


1.2e-31


118.5 


136
139


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family 


2.6e-08
9.3e-22 


41.0 
85.7 


140 


WH2 

zf-C2H2


Wiskott Aldrich syndrome
homology region 2
Zinc finger, C2H2 type


0.0067


23.1


141 


Peptidase_S26


Signal peptidase I


1.7e-82
5.7e-10 


287.5 
35.7 


143 


arf


ADP-ribosylation factor family 


1.2e-39 


145.2 


146 
148


KRAB 
DUF6 


KRAB box 

Integral membrane protein DUF6 


7.3e-30 


112.6


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase


0.096
3.8e-80


8.0 
231.1 


151 
153 


S4 

tRNA-synt_1d


S4 domain


1.1e-08


42.3 


154 

155 
157 


Cyt_reductas 
e 

ras 

actin


tRNA synthetases class I (R)

FAD/NAD-binding Cytochrome

reductase

Ras family

Actin


3.8e-103
7.8e-60

3.6e-28
3.8e-26


356.1 
212.2 

107.0 
87.!

158 


Jacalin 


Jacalin-like lectin domain 


0.09 


-24.9


160 


Zn_carbOpept 


Zinc carboxypeptidase 


5e-138 


471.9


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1


167 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0 


168 


Ribosomal_S15


Ribosomal protein S15


1.1e-06


29.0


169 


DEAD 


DEAD/DEAH box helicase 


1e-48


157.0


171 


DUF59 


Domain of unknown function 
DUF59 


0.07 


-17.4


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15 


58.6 


173 


globin 


Globin


4.6e-18


67.4


174 


WW 


WW domain 


7.3e-06 


32.9


175 


ras 


Ras family 


1e-31


118.8


178 


ATP1G1_PLM_M 
AT8 


ATP1G1/PLM/MAT8 family 


2.5e-17 


71.0


179 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-99 


344.2


180 


C1q


C1q domain


8.8e-72 


251.9 


190 


Y_phosphatas
e 


Protein-tyrosine phosphatase 


4.9e-287


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6 


194 


bromodomain


Bromodomain


5.8e-31


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme


2.5e-64


227.1


197 


DnaJ 


DnaJ domain


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phospha 
t 


Histidine acid phosphatase 


2.5e-10 


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7


205 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit 


1.6e-139 


476.9 


206 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


2.4e-25 


97.6 


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74


258.8 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4 


218 


Glycos_trans
f_2


Glycosyl transferases 


4e-21 


83.6


219 


ig 


Immunoglobulin domain 


0.092 


10.7


222 


WD40


WD domain, G-beta repeat


7.4e-23


89.4 


224 


TPR 


TPR Domain 


1.2e-08 


42.1


225 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain 


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144


492.0 


234 


ras 


Ras family 


4.6e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29 


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09 


45.0


244 


dCMP_cyt_deam


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.1 


245 


ig


Immunoglobulin domain 


6.7e-08 


30.5 


248 




wnt family of developmental 
signaling protei 


9.1e-270 


742.6


250 


mito_carr


Mitochondrial carrier proteins 


1.3e-55 


193.6


254


adenylatekin
ase


Adenylate kinase


1.8e-14


55.7


255 


Cation_efflu
x


Cation efflux family


2.8e-33


124.0


256 


SH3 


SH3 domain 


3.9e-14


60.4 


257 


Aa_trans


Transmembrane amino acid
transporter protein


2.6e-52


187.2


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110


380.2 


259 


HIT 


HIT family


8.2e-07


25.3 


260 


Bacterial pq 

Q 


PQQ enzyme repeat


1.6e-15


65.0 


262 


proteasome


Proteasome A-type and B-type


6.5e-64


225.7


267 




Eukaryotic protein kinase 
domain 


6.3e-27


101.0 


270 


filament


Intermediate filament proteins


3.2e-150


512.5


271 


Choline_kina 


Choline/ethanolamine kinase


2e-67


237.4


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-77


269.9 


280 


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 


281 


WD40


WD domain, G-beta repeat


7.8e-73


255.4 


284 




DHHC zinc finger domain


4.6e-24


93.4


287 


Exonuclease 


Exonuclease 


1.4e-67


238.0 


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2 


294 




Zinc finger, C2H2 type 


1.4e-29


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125


430.0


296


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5 


297 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


302


Glycos_transf_4


Glycosyl transferase 


5e-87 


302.5 


304 




tRNA synthetases class II (D, K


1.1e-84


294.8 


305 


KRAB 


KRAB box


2e-44 


161.0 


306 


rrm 


RNA recognition motif.


2.7e-44


160.6 


308 


7tm_1


7 transmembrane receptor
(rhodopsin family)


5.2e-39


126.1 


309 


seX 


DNA polymerase X family 


2.4e-64


227.2 


311 


F-box


F-box domain.


9.5e-08


39.2


312 


ig 


Immunoglobulin domain


6.8e-19


65.9


313


Ets


Ets-domain


8.1e-60


192.3


315 


Kelch 


Kelch motif


1.3e-106


367.6


317 


arf


ADP-ribosylation factor family


3.2e-35


130.4


318 


sugar_tr


Sugar (and other) transporter


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase
domain


8.1e-83


288.6


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81


282.6


324 


Xlink


Extracellular link domain


4.5e-143


331.5


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain 


8.1e-81 


281.9 


331 


chromo


'chromo' (CHRromatin
Organization MOdifier)


4e-18 


66.7 


333 


Peptidase_M22


Glycoprotease family


1.2e-136


467.4


335


vwa 


von Willebrand factor type A 
domain 


2.3e-07


37.9 


339 


ras 


Ras family 


7.8e-07


-59.1 


340 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-64


225.4 


342 


zf-C2H2


Zinc finger, C2H2 type 


2.4e-85


297.0 


343 


ig


Immunoglobulin domain 


0.0005 


18.0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


351 


EGF 


EGF-like domain


8.5e-20 


79.2 


352 


ank 


Ank repeat 


2.5e-101


350.0 


354 


TBC 


TBC domain 


5.1e-15


63.3 


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


359 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53


189.7 


363 


efhand


EF hand


5.4e-10


46.6 


367 


LRR


Leucine Rich Repeat


8.8e-44


158.9 


368 


laminin_G 


Laminin G domain 


1.5e-33


121.7 


369 


PP2C 


Protein phosphatase 2C 


5.3e-20


73.9 


372 


LIM 


LIM domain containing proteins 


9.9e-15


57.1 


373 


KRAB 


KRAB box 


4.8e-23


90.0


376


ion_trans


Ion transport protein 


2.9e-09


-4.2 


377 


Beach 


Beige/BEACH domain


4.9e-208


704.5


380 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94 


327.5 


381 


AMP-binding


AMP-binding enzyme


1.4e-07


-140.3 


382 


HECT 


HECT-domain (ubiquitin-transferase).


1.3e-07


-13.5


384 


ank 


Ank repeat 


2.5e-101


350.0 


386 




Immunoglobulin domain 


9.5e-06


23.6


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6 


389 


ig


Immunoglobulin domain 


2.8e-15 


54.3


390 


mito_carr 


Mitochondrial carrier proteins 


3.5e-67 


233.2 


392 


TPR 


TPR Domain 


6.1e-17 


69.7 


393 


SH3 


SH3 domain 


3.5e-09


43.9 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83.6


396 


spectrin 


Spectrin repeat 


2.1e-67


237.3 


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066


23.1


399 


fn3 


Fibronectin type III domain 


4.1e-102


352.6


400 


WD40 


WD domain, G-beta repeat 


0.00049


26.8 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119


409.6


402 


fn3 


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10 


48.0 


405 


cadherin 


Cadherin domain 


8.1e-81


281.9


406 


zf-CXXC 


CXXC zinc finger 


5e-15 


63.4 


410 


RhoGEF


RhoGEF domain


1.1e-23


92.1 


411 


F-box 


F-box domain. 


4.2e-06


33.7


412 


SNF2_N 


SNF2 and others N-terminal
domain 


5.8e-16 


61.6 


415 


CPSase_L_cha 
in 


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24 


93.6 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5


420 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch 


G-patch domain 


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repea
t


Plexin repeat


0.0023


24.6


427 


Plexin_repeat


Plexin repeat


0.0023


24.6

429 

431 
432 


zf-C3HC4 

DEAD
SH3


Zinc finger, C3HC4 type (RING
finger)

DEAD/DEAH box helicase

SH3 domain


8.6e-11
1e-66


39.2 
214.0 


433 
436 


GTP_CDC


Cell division protein
Collagen triple helix repeat
(20 copies)


3.4e-16
2.1e-114
4.6e-194


67.2
393.5
658.1


438 


Ricin_B_lect
in


Similarity to lectin domain of
ricin b


0.0085


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1.2e-256 


866.0 


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1.8e-235 


795.7 


443 


PDZ


PDZ domain (Also known as DHR
or GLGF).


1.9e-65


230.9 


445

446
451


LON
sushi


ATP-dependent protease La (LON)
domain

Immunoglobulin domain
Sushi domain (SCR repeat)


0.00012
0.00011


-17.1 
20.1 


452 
454 

456 


fn3 

pyridoxal_deC

kinesin


Fibronectin type III domain
Pyridoxal-dependent
decarboxylase conse
Kinesin motor domain


1.4e-18
1.5e-06
8.3e-14


75.2 
35.2 
50.3 


457 
458 


neur_chan
Josephin


Neurotransmitter-gated ion-channel
Josephin


4.9e-217
1e-175

0.0002


734.4
597.1

18.7


468 
470 

471 


bZIP 

NTP_transfer 

ase 

WD40


bZIP transcription factor
Nucleotidyl transferase

WD domain, G-beta repeat


1.7e-07
6.3e-06

2e-28 


31.8 
-26.3 

107.9 


473 
477 

479 


LIM 

zf-RanBP 
WD40 


LIM domain containing proteins 
Zn-finger in Ran binding
protein and others.
WD domain, G-beta repeat


0.00021
0.029

6.5e-18


20.7
21.0

73.0 


480

481

485


KRAB
ArfGap

SH2


KRAB box

Putative GTP-ase activating
protein for Arf

Src homology domain 2


1e-31
8.4e-66

0.011


118.8
232.0 

11.4 


486 
487 

489 


C1q
dsrm


C1q domain

Double-stranded RNA binding
motif

Zinc finger, C2H2 type


4.3e-74

1.1e-47


259.6 
171.9 


490 
492 


Alpha_adapti

n_C

SKI


Alpha adaptin carboxyl-terminal
domai 

shikimate kinase 


4.8e-153 
3.4e-222 


521.9 
751.6 


497 


ENV_polyprot

ein


ENV polyprotein (coat
polyprotein)


1.2e-10
2.6e-22 


48.8 
77.6 


498 

500 
501 


abhydrolase_2

rrm
WW


Phospholipase/Carboxylesterase

RNA recognition motif.
WW domain


0.041

5.4e-34
4.6e-18


-48.1

126.4 
73.4 


502 
504 
505 


ig

abhydrolase

vwa


Immunoglobulin domain
alpha/beta hydrolase fold
von Willebrand factor type A
domain


1.1e-10

0.045 

7.1e-62 


39.5 
-3.6 
219.0 


508 
509 


Na_K_ATPase_C

Exonuclease


Na+/K+ ATPase C-terminus

Exonuclease


2.3e-145


496.3


510 


Glycos_trans
f_1


Glycosyl transferases group 1 


1.3e-56 
2.9e-06 


201.5 
27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1 


1.9e-09 


38.5 


514 


pro_isomerase


Cyclophilin type peptidyl-prolyl cis-tr


1.8e-63


515 


EGF 


EGF-like domain 


1.9e-18


74.7


516 


Surp 


Surp module 


4.3e-38 


140.0


523 


ig


Immunoglobulin domain 


3.3e-06 


25 .0 


526 


UBX 


UBX domain 


1.1e-34


128.6


528 


adh_zinc 


Zinc-binding dehydrogenases


2.7e-34


127.4


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025


-34.1


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7


533 


mito_carr


Mitochondrial carrier proteins


2e-61


213.5


534 


thiolase 


Thiolase 


3.5e-183


622.0


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7


536 


SCAN 


SCAN domain 


4e-55 


196.6


537 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


539 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


1.9e-117


403.6


540 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit


5.9e-85


295.7


543 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-69


242.6


544 


DUF101 


Protein of unknown function 
DUF101 


8.5e-38 


139.0


545


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2


547 


WD40 


WD domain, G-beta repeat 


2.6e-32 


120.8 


548 


RHD 


Rel homology domain (RHD) . 


1.6e-238 


686.2


549 


MMR_HSR1 


GTPase of unknown function 


5.4e-67


236.0


551


HECT


HECT-domain (ubiquitin-transferase).


4.3e-127


435.6


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74


259.8 


555 


zf-UBR1


Putative zinc finger in N-recognin


3.3e-16


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding


AMP-binding enzyme


2.8e-06


-163.7


562 


PABP 


Poly-adenylate binding protein,
unique domai


4.9e-38


139.8


564 


Gag_p30 


Gag P30 core shell protein 


1.2e-67 


238.2


566 


PWWP 


PWWP domain 


8.1e-16


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081


-79.7 


572 


myosin_head


Myosin head (motor domain) 


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


575 


Surp 


Surp module 


1.7e-23


91.5


576 


Surp 


Surp module 


1.7e-23


91.5


577 


DNA_pol_B 


DNA polymerase family B 


0 


1138.6 


578 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


8.3e-09


42.7 


579 


LRR 


Leucine Rich Repeat 


4.9e-21 


83.3 


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177 


601.3


583 


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3


586 


KH-domain


KH domain


2.9e-13


57.5


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7


591 


bromodomain


Bromodomain


6.6e-32


114.7


592 


hormone_rec 


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1 


593 


PHD 


PHD-finger


3.8e-12


53.8


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7 


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2


597 


WD40


WD domain, G-beta repeat 


0.00054 


26.7


600 


FG-GAP


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86


300.4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr


Mitochondrial carrier proteins


6.3e-67


232.3


608 


PWWP 


PWWP domain 


2.6e-28 


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY 


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_bind 
ing 


RFX DNA-binding domain 


5.2e-54


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8 


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1


620 


MATH 


MATH domain 


7.8e-05


22.2


621 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopteri
n 


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12 


42.2 


625 


TPR 


TPR Domain 


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding
domain


3.7e-58


206.6


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2


Zinc finger, C2H2 type


2.1e-88


307.1 


632 


rrm 


RNA recognition motif.


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104


360.7 


636 


Fork_head 


Fork head domain 


5.9e-27


103.0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5 


642 


TPR 


TPR Domain 


4.8e-08


40.1 


643 


efhand


EF hand 


1.9e-27 


104.6 


647 


SNF2_N


SNF2 and others N-terminal
domain


1.2e-101


351.1 


648


PseudoU_synt
h_2


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087 


22.7 


651 


ank 


Ank repeat 


1.3e-17


71.9 


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171 


581.8 


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47 


169.9 


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


661 


pou 


Pou domain - N-terminal to
homeobox domain 


5.3e-45 


162.9 


662 


C2


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 


76.2 


664 


C2 


C2 domain 


6.7e-19 


76.2


667 


GST 


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


670 


spectrin 


Spectrin repeat 


4e-57


203.2 


671 


I_LWEQ 


I/LWEQ domain 


9.5e-101


341.0 


672 


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40


WD domain, G-beta repeat


4.8e-24


93.3


675 


WD40 


WD domain, G-beta repeat


4.8e-24


93.3


676 


LRR 


Leucine Rich Repeat


0.0015 


25.2 


679 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-29 


107.7


680 


zf-C2H2 


Zinc finger, C2H2 type 


5.2e-05 


30.1 


681 


CH 


Calponin homology (CH) domain 


2.4e-17


71.1


682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4.3e-43


156.6 


683


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.051


10.8


687 


Synapsin 


Synapsin 


0 


1890.8 


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8


691 


homeobox 


Homeobox domain 


8.5e-30 


112.4 


696 


Peptidase_M24


metallopeptidase family M24 


2.6e-59


210.5 


697 


RhoGEF 


RhoGEF domain 


9.5e-35 


128.9 


698 


PHD 


PHD-finger


0.008


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0


702 


Sulfatase


Sulfatase


3e-231 


781.6 


703 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-20 


79.8


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8


708 


WD40


WD domain, G-beta repeat 


4.8e-19


76.7 


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase 


9.9e-42


134.9 


714 


PH 


PH domain 


1.6e-09


39.0


715 


DSPc 


Dual specificity phosphatase,
catalytic doma 


1.5e-37


138.2


717 


Sialyltransf


Sialyltransferase family


7.5e-31


115.9 


718


ig


Immunoglobulin domain


1e-29


100.8 


719 


integrin_B


Integrins, beta chain 


0 


1125.4 


720 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-08


32.4 


722 


Peptidase_C2


Calpain family cysteine 
protease 


3e-145


495.9 


723 


ig


Immunoglobulin domain 


2.2e-05


22.4 


724 


F-box 


F-box domain.


0.007


23.0


725


Nop


Putative snoRNA binding domain


8.1e-58


205.5


726 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


727 


WD40 


WD domain, G-beta repeat 


7.5e-26


99.3 


730 


dsrm


Double-stranded RNA binding
motif


0.027


12.1 


731 


dynamin 


Dynamin family 


4.2e-16


66.9 


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7 


735 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57 


182.5 


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32 


119.5 


742 


ras 


Ras family 


2.2e-100 


346.9 


743 


PMI_typeI 


Phosphomannose isomerase type I


1.2e-243 


822.9


747 


trypsin 


Trypsin 


6.4e-88


279.4 


748 


kazal 


Kazal-type serine protease
inhibitor domain 


2.2e-52 


187.4


749 


efhand


EF hand 


6.3e-06 


33.1 


751 


PHD 


PHD-finger


4.9e-16


66.7


752 


zf-C2H2 


Zinc finger, C2H2 type 


3.2e-21


83.9


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8


754 


Ribosomal_L39


Ribosomal L39 protein


0.00018 


26.7 


755 


PH


PH domain 


3.6e-14


55.7 


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759 


PA 


PA domain 


0.0065


23.1 


760 


arf 


ADP-ribosylation factor family 


2.2e-19


77.8 


761 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6


762


histone


Core histone H2A/H2B/H3/H4 


9.9e-53


188.6


763 


zf-MYND


MYND finger 


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6


"to 


vwc 


von Willebrand factor type C 
domain 


2.9e-34


127.3 


769 


efhand


EF hand 


4.8e-11


50.1 


770 


zf-C4


Zinc finger, C4 type (two 
domains) 


2.4e-53


181.6 


772 


ras 


Ras family


7e-90


312.0


Til 


Sulfatase


Sulfatase


1e-142


487.5 


775


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5


776 


zf-C2H2


Zinc finger, C2H2 type 


1.1e-12


55.5 


777


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32 


121.1


779 


G6PD 


Glucose-6-phosphate
dehydrogenase


1.5e-76


236.6 


780


spectrin 


Spectrin repeat 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins 


4.6e-57


198.5


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7


786 


ras 


Ras family 


5.3e-39


143.0


787 


RNase_HII


Ribonuclease HII


2.5e-67


237.1 


790 


PI3_PI4_kinase


Phosphatidylinositol 3- and 4-
kinases


5.4e-108


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40


147.4


796 


ARID 


ARID DNA binding domain 


1.6e-20


81.6


797 


trypsin 


Trypsin 


9.9e-20 


64.8


799 


CH 


Calponin homology (CH) domain 


3.7e-15


63.8


801 


Gal-bind_lectin


Vertebrate galactoside-binding 
lectin 


4.1e-25 


88.7


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1 


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26 


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase 


8.8e-80 


278.5 


811 


CBFD_NFYB_HMF


Histone-like transcription
factor 


6e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20


79.3


814 


IMP4


Domain of unknown function


3.3e-71


250.0


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66


232.1


816 


Pept_tRNA_hydro


Peptidyl-tRNA hydrolase


1.6e-37


138.0


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3


826 


IF5_eIF4_eIF2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3 


831 


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1 


832 


laminin_EGF


Laminin EGF-like (Domains III
and V)


2e-57


204.2


839 


rrm 


RNA recognition motif. 


1.3e-22 


88.5 


840 


Y_phosphatase


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase 


Eukaryotic protein kinase 


3.4e-100 


346.3


844 


Ribosomal_L22e


Ribosomal L22e protein family 


le-^4 


228.4 


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012 


-6.0 


858


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58


206.7 


859


rrm 


RNA recognition motif. 


5.4e-45 


162.9 


861 


PRK 


Phosphoribulokinase 


5.1e-62 


219.4 


863 


mito_carr 


Mitochondrial carrier proteins 


2.9e-53


185.5 


864 


HSP90 


Hsp90 protein 


4.7e-158


538.5 


866 


ig


Immunoglobulin domain 


4e-12 


44.1


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8 


874 


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0 


879 


Ribosomal_S12e


Ribosomal protein S12e 


2.1e-98 


340.3 


882


serpin 


Serpins (serine protease 
inhibitors) 


2.5e-42


145.7 


883 


Patatin 


Patatin 


1.2e-51


182.0


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.044


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12


54.3


889 


sugar_tr 


Sugar (and other) transporter 


8.2e-63


222.1


893 


DUF28 


Domain of unknown function 
DUF28 


1.3e-43 


158.3 


896 


IP_trans


Phosphatidylinositol transfer
protein


6.5e-98


338.7 


898


DEAD 


DEAD/DEAH box helicase 


1.5e-48


156.5 


899 


KE2 


KE2 family protein 


7e-61 


215.7 


900 


KE2 


KE2 family protein 


4.3e-51 


183.2 


901


zf-C2H2


Zinc finger, C2H2 type


2.7e-57


203.8 


902 


ras 


Ras family 


2.3e-75 


263.8 


904 


TPR 


TPR Domain 


3.2e-22 


87.2 


906 


GBP 


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809.6 


908 


WD40 


WD domain, G-beta repeat 


2.6e-26 


100.8


909 


PH 


PH domain 


1.3e-09


39.4 


910 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-39 


144.1


913 


Epimerase 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09


30.7 


922 


WD40


WD domain, G-beta repeat 


1.6e-25 


98.2


923


WD40


WD domain, G-beta repeat 


8.2e-07 


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05 


29.1 


925 


UQ_con


Ubiquitin-conjugating enzyme


0.00033 


-27.6 


926 


CH 


Calponin homology (CH) domain 


3.3e-53


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48


172.7 


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10


37.4 


930 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


7.2e-105


361.8 


931 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


1.2e-96


334.4


936


C2


C2 domain


2.2e-62


220.7 


937 


NAP_family 


Nucleosome assembly protein
(NAP)


1.1e-22


84. £ 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011


3.1


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75


263.2


949 


WD40


WD domain, G-beta repeat


1.8e-27


104.7


950 


Acyltransferase


Acyltransferase


1.6e-07 


38.4 


951


SAM


SAM domain (Sterile alpha
motif)


0.014


14.5


954 


GFO_IDH_MocA


Oxidoreductase family 


1.3e-11


52.0


955 


BTB 


BTB/POZ domain 


7e-22 


86.1


956 


BTB 


BTB/POZ domain


7e-22 


86.1 


957 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


0.053 


-22.2 


959 


ras 


Ras family 


2.4e-97 


336.8 


960


ras


Ras family


8.4e-43 


155.6


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08


42.2


962


adh_short


short chain dehydrogenase


2.4e-31


117.6


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2 


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193


653.9 


y / w 


RNase_PH


3' exoribonuclease family 


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


3.6e-21


83.7 


978 


Ribosomal_L17


Ribosomal protein L17


2.4e-20


81.0 


979 


LIM 


LIM domain containing proteins 


5.8e-42 


152.8 


980 

qqo 


Calsequestrin


Calsequestrin


1.7e-297


1001.7 




HSP20 


Hsp20/alpha crystallin family 


1.2e-10


43.2 


983 


oxidored_q6 


NADH ubiquinone oxidoreductase, 
20 Kd sub 


4.8e-63


222.9 


988 


TBC 


TBC domain 


2.2e-50 


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8 


993 


tRNA_int_endo


tRNA intron endonuclease 


0.0017 


-34.2 


994 


homeobox


Homeobox domain 


4e-18 


73.6


997 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 


0.012


11.6 


1000 


mito_carr


Mitochondrial carrier proteins


9.7e-123


421.2 


1001 


RA 


Ras association (RalGDS/AF-6)
domain


1.2e-15


65.4 


1004 


DUF81


Domain of unknown function 
DUF81 


0.099


10.2 


1005 


actin


Actin


1.3e-174


574.3 


1006 


actin


Actin


3.1e-130


428.6 


1007


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1011 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1012


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.7e-15


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018 


RhoGAP


RhoGAP domain 


1.6e-78 


274.3


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18


69.7


1026 


HMG_box


HMG (high mobility group) box


8.4e-20


79.2


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con


Ubiquitin-conjugating enzyme


1.4e-49


178.1 


1032


PDZ


PDZ domain (Also known as DHR
or GLGF).


0.028


16.3 


1034 


Hydrolase


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6 


1037 


KRAB 


KRAB box 


4.8e-06


32.4 


1038 


Cation_efflux


Cation efflux family 


7.1e-42 


152.5


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat


1.9e-18


74.7 


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24


93.7


1045 


lectin_c


Lectin C-type domain


1.9e-28


108.0


1046 


Glucosamine_iso


Glucosamine-6-phosphate
isomerase


0.00013


-25.1 


1047 


ligase-CoA 


CoA-ligases 


4.5e-80 


279.4


1049 


ig


Immunoglobulin domain 


1.7e-09 


35.6 


1050 


Ribosomal_L24e


Ribosomal protein L24e 


2e-33 


124.5


1054 


Amidase 


Amidase 


4.3e-152


518.7


1055


rrm


RNA recognition motif. 


3.8e-26


100.3


1058 


annexin


Annexin


6.9e-44 


159.2 


1059 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1060 


homeobox 


Homeobox domain 


3.2e-31 


117.2 


1062 


Acyltransferase


Acyltransferase


0.00065 


10.5 


1064 


AMP-binding 


AMP-binding enzyme 


6.6e-100 


345.3 


1065 


LRR 


Leucine Rich Repeat 


3.3e-14 


60.6 


1066 


GTP1_OBG


GTP1/OBG family


4.8e-41


141.8 


1071 


ig


Immunoglobulin domain


8.4e-48


159.1 


1072 


PHD 


PHD-finger


6.8e-07


36.3 


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5 


1075 


SCP 


SCP-like extracellular protein 


4.7e-41


149.8


1077 


OLF


Olfactomedin-like domain


2.2e-66


234.0


1078 


mito_carr 


Mitochondrial carrier proteins 


1e-42


149.3 


1079 


WD40


WD domain, G-beta repeat


6.2e-45


162.7


1087


START 




2.5e-48


174.7


1093 


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63


223.4


1094 


GSHPx 


Glutathione peroxidase


9.6e-41


148.8


1095 


DUF25 


Domain of unknown function
DUF25


2e-75


264.0


1096


DUF25


Domain of unknown function
DUF25


6e-75


262.4


1105 


Nitroreductase


Nitroreductase family


1.3e-13


58.6


1106 


PTE 


Phosphotriesterase family


1.3e-179


610.1 


1107


DAGKc 




0.00049


19.6 


1109 


ras 


Ras family 


1.3e-15


40.7 


ii 1 c 
XX JL» 


ArfGap


Putative GTP-ase activating
protein for Arf


9.7e-47


168.7


1116 


HMG14_17


HMG14 and HMG17


4.4e-21


83.5


1117 


HMG14_17


HMG14 and HMG17


9.9e-12


52.4


1119 


FAA_hydrolase


Fumarylacetoacetate (FAA)
hydrolase fam


2e-83


290.6


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94


327.6 


1123 


abhydrolase


alpha/beta hydrolase fold


9.2e-23 


89.0 


1129 


pro_isomerase


Cyclophilin type peptidyl- 
prolyl cis-tr 


2 .2e-56 


197.1 


1131 


DnaJ


DnaJ domain


1.6e-30


114.9


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15


64.9


1134 


PH 


PH domain 


0.0015


17.8 


1136 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8


1139 


ras 


Ras family 


1.5e-86 


301.0 


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransferase


Acyltransferase


1.2e-05


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55 


196.1 


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9


1157 


Asparaginase_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9 


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 




3.9e-05


30.5


1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


1168 


SAM 


SAM domain (Sterile alpha
motif)


0.04


10.5


1170 


abhydrolase


alpha/beta hydrolase fold


0.098


-7.5


1174 


SAP 


SAP domain


3.9e-10


47.1


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5


1178 


WD40 


WD domain, G-beta repeat


4.7e-35


129.9 


1180 


Ets 


Ets-domain 


1.8e-09


33.3


1181


Collagen


Collagen triple helix repeat


0.00016


24.7


1182 


TCL1_MTCP1


TCL1/MTCP1 family 


9.5e-56 


198.6


1184


RasGEF


RasGEF domain


1.7e-88


307.4


1185 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain


0.0042 


15.6 


1188 


Orn_DAP_Arg_deC


Pyridoxal-dependent
decarboxylase


6.2e-128 


430.6 




Stathmin 


Stathmin family 


1.8e-90


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90


314.0


1195 


Sec1


Sec1 family


3.2e-183 


622.1 


1196 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


3.1e-32


111.8


1197 


Glyco_transf_8


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh_short 


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_methyltran


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4 


1208 


7tm_3


7 transmembrane receptor 


7.2e-09 


29.0


1209


ank


Ank repeat


3.9e-15


63.7


1210 


vATP-synt_AC39


ATP synthase (C/AC39) subunit


2.5e-128


439.7


1212 


zf-C2H2


Zinc finger, C2H2 type


5.5e-17


69.9


1213 


efhand


EF hand 


3.2e-07 


37.4 


1219


rrm


RNA recognition motif.


2.1e-40 


147.7 


1220 


DUF6


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71 


251.1 


1223


G-gamma


GGL domain


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 




PX 


PX domain 


2.2e-15


64.5 


1233 


PX


PX domain 


2.2e-15 


64.5 


1236


FCH


Fes/CIP4 homology domain


3.3e-09


44.0


1241 


Peptidase_M20


Peptidase family M20/M25/M40 


2e-63 


224.1


1243 


WW


WW domain


0.044


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006


6.3e-61 


215.8 


1248 


Glycos_transf_2


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand


EF hand


4e-11


50.4 


1254 


UQ_con


Ubiquitin-conjugating enzyme


2.1e-73 


257.3 


1255


ras


Ras family


2.2e-62 


220.7 


1256 


formyl_transf


Formyl transferase


4.9e-30


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_red


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transpept


Gamma-glutamyltranspeptidase


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22 


86.9 


1266 


SCP 




6e-29 


108.0


1267 


K_tetra 


K+ channel tetramerisation
domain


2.8e-27 


104.0 


1269 


ras 


Ras family


1.3e-85 


297.9 


1275 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-10


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.8 


1277


abhydrolase


alpha/beta hydrolase fold


5.6e-21 


83.1 


1279 


trypsin


Trypsin


4.4e-41


132.0 


1280 


PBP 


Phosphatidylethanolamine-
binding protein


1.3e-13


58.7 


1285 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.6e-14 


49.6 


1287 


ank


Ank repeat


1.7e-52 


187.8 


1294 


fn3


Fibronectin type III domain 


0.026 


20.9 


1295 


GBP 


Guanylate-binding protein


0.00026 


-70.0 


1296 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


6.9e-41 


149.3 


1297 


Rhodanese


Rhodanese-like domain


3.2e-14 


60.7 


1298 


LIM


LIM domain containing proteins


5.8e-21


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43


145.2 


1307


mito_carr


Mitochondrial carrier proteins


2.1e-53


186.0 


1308 


WD40 


WD domain, G-beta repeat


1.6e-17 


71.6 


1310 


UPAR_LY6


u-PAR/Ly-6 domain


7.1e-20


75.5


1313 


thiored 


Thioredoxin 


3.6e-05 


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67


237.9


1316 


trypsin 


Trypsin 


4.4e-41


132.0


1320 


Ribosomal_L13


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_seg


Armadillo/beta-catenin-like
repeats 


0.0054 


23.4 


1328 


KRAB 


KRAB box 


0.052 


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10


48.0


1333 


KRAB 


KRAB box 


1.8e-36


134.6


1334 


UPP_synthetase


Putative undecaprenyl
diphosphate synt


2.3e-89


310.3


1335 


UPP_synthetase


Putative undecaprenyl
diphosphate synt


1.5e-59


211.0 


1336 


DSPc


Dual specificity phosphatase,
catalytic doma


1.2e-31


118.6 


1337 


DSPc


Dual specificity phosphatase,
catalytic doma


2 .3e-12 


54.5 


1338


TPR


TPR Domain


0.00021 


28.1 


1340 


metalthio 


Metallothionein


0.013 


20.3 


1341 


mutT 




5.8e-09


36.5 


1343 


Band_41


FERM domain (Band 4.1 family) 


1.3e-38 


122.5 


1344 


Kelch




1.4e-44


161.5 


1345 






1.2e-10 


48.8 


1347


3Beta_HSD


dehydrogenase/isomera


0.086 


-177.2 


1348


bla 




5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.7 


1352 


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase


3.6e-65 


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17 


7.9e-40 


145.7 


1362 


SIS 


SIS domain


3.8e-30


113.6


1363 
1364 


SIS 
ig 


SIS domain 

Immunoglobulin domain


1.3e-28 


108.5 


1368 


K_tetra


K+ channel tetramerisation 


0.00026 
1.1e-16


19.0
68.9 


1371 


Collagen


Collagen triple helix repeat
(20 copies)


2.2e-113


390.1 


1372 


DnaJ 


DnaJ domain


6.6e-36


132.7


1376 


KRAB 


KRAB box 


2.1e-38


141.0


1378 


ELM2


ELM2 domain 


2e-23 


91.3 


1380 


thiored 


Thioredoxin


1.2e-23


82.8


1381 


ank 


Ank repeat


2.3e-83 


290.4 


1382 


BTB 


BTB/POZ domain


3e-11


50.8


1383


WD40 


WD domain, G-beta repeat 


1.6e-19 


78.3


1384 


WD40 


WD domain, G-beta repeat 


6.3e-24


92.9 


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger)


1.1e-09


35.6


1389 
1390 


zf-C2H2
zf-C2H2 


Zinc finger, C2H2 type 
Zinc finger, C2H2 type 


5.5e-50 


179.5 


1393 


kinesin


Kinesin motor domain


2.5e-85 
7.8e-188 


296 .9 
637.4 


1394 


zf-C2H2


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22 


86.6


1402 
1405 


bZIP
sugar_tr


bZIP transcription factor
Sugar (and other) transporter


0.035 


13.1


1406 
1407 


rrm 


RhoGAP domain 

RNA recognition motif.


0.003
8.9e-47


-101.5
168.8 


1408


LRR


Leucine Rich Repeat 


1e-35
2.1e-13


132.1
58.0


1409


Nebulin_repeat


Nebulin repeat


6e-54 


192.6 


1410 


ank 


Ank repeat 


1.6e-17


1 1 . t> 


1412 


Ribosomal_L5_C


ribosomal L5P family C-terminus


8.2e-58 


205.5 


1415


trypsin


Trypsin


4.7e-85 


270.4 


1416 


aminotran_1


Aminotransferases class-I 


4 .4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1


1419 


WD40 


WD domain, G-beta repeat 


2.2e-09 


44.6


1422 


cadherin 


Cadherin domain 


8.3e-42


152.3


1424 


SH3 


SH3 domain 


2.5e-80 


280.3 


1425 
1426 


PHD 
PHD 


PHD-finger
PHD-finger


3.2e-17 


70.6 


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


3.2e-17
1e-37


70.6 
138. a 


1428 

1429 
1430 
1431 


helicase_C 
WD40

inositol_P

mito_carr


Helicases conserved C-terminal
domain 

WD domain, G-beta repeat 
Inositol monophosphatase family 
Mitochondrial carrier proteins 


1e-26

3.9e-07
2.5e-10 


102.2 

37.2 
40.2 


1433
1434 
1435 

1436 
1438 
1440 


C1q
WD40

Inos-1-P_synth

rrm 

GAdapt_CT 


C1q domain

WD domain, G-beta repeat
Myo-inositol-1-phosphate
synthase

RNA recognition motif.
Immunoglobulin domain
Gamma-adaptin, C-terminus


4.3e-83
2.9e-16
1.6e-13
7e-228

1.4e-34
1.3e-12
3.4e-67 


287.7 
66.2 
58.3
770.4

128.3
45.6
236.7


1441 
1443 
1446 
1447 
1448 


G_Adapt CT 

Kelch 

ARID 

zf-C2H2

AMP-binding 


Gamma-adaptin, C-terminus 
Kelch motif 

ARID DNA binding domain 
Zinc finger, C2H2 type


3.4e-67

0.00013

1.8e-21
9.4e-28 


236.7 
28.7 
84.7 
105.6 


1451

1454 

1455 

1460 

1461 

1470 

1472 


rrm 

ig
Sialyltransf
Aldose_epim

C2

TIG

PseudoU_synt


AMP-binding enzyme

RNA recognition motif.

Immunoglobulin domain

Sialyltransferase family

Aldose 1-epimerase

C2 domain

IPT/TIG domain

RNA pseudouridylate synthase


2.6e-07 
6.5e-21
5.6e-44
5.4e-21
1.9e-35
4e-18
3.1e-19
4.3e-16 


-145.1 
82.9 
146.7 
83.2 

131.2
73.6
77.3 
66.9 




h_2 








1474 


DENN 


DENN (AEX-3) domain 


1.3e-44 


161.6 


1475 


Cation_efflux


Cation efflux family 


4.6e-49 


176.4 


1477 


TBC 


TBC domain 


8e-47 


169.0


1478


rrm


RNA recognition motif. 


2e-21 


84.6 


1480 


ig 


Immunoglobulin domain 


5.5e-06 


24.3 


1484 


Telo_bind_alpha


Telomere-binding protein alpha
subuni 


0.028 


-225.9 


1485 


zf-C2H2


Zinc finger, C2H2 type 


1.8e-68 


240.9 


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9.5e-13 


49.9 


1488 


helicase_C 


Helicases conserved C-terminal
domain 


1.4e-15 


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family


5.2e-41


149.7 


1491 


guanylate_cyc


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46 


166.1 


1492 


LRR


Leucine Rich Repeat


3.4e-19


77.2 


1495 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10 


36.3


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8 


1500 


SH3 


SH3 domain 


9.3e-05 


27.2 


1502 


homeobox 


Homeobox domain 


0.084 


13.8 


1503 


homeobox 


Homeobox domain 


0.084


13.8 


1505 


EGF 


EGF-like domain


2.7e-23


90.8 


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2 


1508 


Peptidase_M20


Peptidase family M20/M25/M40


2.8e-28


101.8


1511 


PX 


PX domain 


1.9e-11


51.5 


1512 


Sulfatase


Sulfatase


2.8e-35


130.7 


1516 


Syntaxin 


Syntaxin 


0.011


-62.3


1518 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


9.7e-106


305.6 


1520 


ig 


Immunoglobulin domain 


0.075 


11.0 


1521 


RA 


Ras association (RalGDS/AF-6)
domain 


0.013 


13.3 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05 


18.7 


1528 


WD40 


WD domain, G-beta repeat 


5.4e-24 


93.1 


1535 


IMS 


impB/mucB/samB family 


7.8e-95 


328.5 


1538 


FYVE 


FYVE zinc finger 


3.2e-27


101.5 


1539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 


1540 


Ocular_alb 


Ocular albinism type 1 protein 


0 


1184.7 


1653 


SAP 


SAP domain 


6e-06 


33.2


1654


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0 


1655 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0 


1656 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function 


0.0011 


-45.5 


1659 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11


51.1 


1660 


actin


Actin 


6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23 


84.3


1671 


SH2 


Src homology domain 2 


5.4e-15 


46.9 


1672 


chromo


'chromo' (CHRromatin
Organization MOdifier)


2.1e-18


67.7


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


U . U02r> 


17.6


1676


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-187 


636.2 


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40


WD domain, G-beta repeat


1.1e-27


105.5 


1681 


WD40


WD domain, G-beta repeat


1.1e-27


105.5


1683 


MMR_HSR1 


GTPase of unknown function 


1.8e-78


274.1


1621 


rrm


RNA recognition motif.


1.8e-37


137.9


1632 


rrm 


RNA recognition motif. 


1.8e-37


137.9


16S3 


AAA 


ATPases associated with various
cellular act


1.3e-81


284.5


1637 


Ferric_reduct


Ferric reductase like
transmembrane com


8.4e-82


285.2


1638


Ferric_reduct


Ferric reductase like
transmembrane com


3.5e-53


190.1


1693 


zf-C2H2


Zinc finger, C2H2 type


4.4e-34


126.6


1700


arf


ADP-ribosylation factor family


9e-19


75.8


1702 


GTP_EFTU 


Elongation factor Tu family


0.014


11.4


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4


1707


pkinase


Eukaryotic protein kinase
domain 


1.2e-88 


307.9 


1709 


WD40


WD domain, G-beta repeat


0.0035


24.0


1710 


LRR 


Leucine Rich Repeat


1.2e-30


115.3 


1711 


WW 


WW domain 


7.6e-12 


52.8


1712 


ank 




4.2e-34


126.7 


1713 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1715 


ras


Ras family


4.4e-41


149.9 


1718 


HMG_box


HMG (high mobility group) box


8.3e-21


82.6 


1719 


TBC 


TBC domain


1.1e-45


165.2 


1721 


HLH 


Helix- loop- helix DNA-binding 

uwu id xxi 


9.2e-10 


45.9 


1723 


dsrm 


uuuuAc-st_anaea *cna Dinding 
motif 


2 .9e-05 


30.9 


1724 


RmaAD 


dime thy 1 as es 


0 . 045 


9.2 


1725 


CIDE-N 




5 . 9e-40 


146 .2 


1726 


HAT 


HAT (Hal f -A-TPR ) -<-P-ne>af*. 


2 . 9e-44 


160 . 5 


1728 


efhand 


EF hand 


5 . le-20 


79. 9 


1733 


Hist deacety 
1 


Hi stone deacetylase family ' 


1 . 7e— 104 


360 . 6 


1735 


LRR 


lieu cine Rich Repeat 


"± . DC — J*i 


126. - 6 


1739 


PI-PLC-X 


Phospha tidy linos itol- specific 
phpsphol ipa se 




16 . 1 


1743 


ras 


Ras family 


3 . 7e-10 


-21.3 


1744 


ras 


Ras family 


3 . 7e-10 


- ,3 


1745 


RasGEF 


RasGEF domain 


3.2e-49 


176.9 


1746 


aan snore 


short chain, dehydrogenase 


7.1e-08 


34.6 


1751 


zf-C2H2 


zinc finger, C2H2 type 


9e-3 9 


142.2 


1754 


f n3 


Fibronectin type III domain 


5.5e-101 


348. 9 


1756 
nco 


zr-C2H2 


Zinc finger, C2H2 type 


6.3e-93 


322.1 


J. fi>8 


rrra 


RNA recognition motif. 


0.017 


21.2 


1760 


Nop 


fucative snoRNA binding domain 


6.ie-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328. 8 


1765 


MMR_HSR1 


GTPase of unknown function 


6.4e-41 


149. 4 


1769 
1775 


CN_hydrolas e 
ank 


Carbon -nitrogen hydrolase 
Ank repeat 


3e-06 


-43.9 


1779 
1783 
1784 


Oxysteroi BP 

RhoGEF 

RhoGBF 


oxysteroi -binding protein 
RhoGEF domain 
RhoGEF domain 


4.1e-07 
4.7e-56 
1.6e-23 
1.6e-23 ■ ■ 


37 .1 
199.6 
91.6 
91.6 
1785 | rrm | RNA recognition motif. |  | 59.7
TABLE 5

SEQ ID NO: | POSITION OF SIGNAL IN AMINO ACID SEQUENCE | MaxS (MAXIMUM SCORE) | MeanS (MEAN SCORE)
1 | 1-21 | 0.991 | 0.955
2 | 1-31 | 0.995 | 0.944
3 | 1-33 | 0.949 | 0.736
4 | 1-19 | 0.970 | 0.951
5 | 1-26 | 0.971 | 0.863
6 | 1-26 | 0.971 | 0.863
7 | 1-26 | 0.971 | 0.863
8 | 1-26 | 0.971 | 0.863
9 | 1-46 | 0.982 | 0.901
10 | 1-21 | 0.991 | 0.955
11 | 1-23 | 0.989 | 0.899
12 | 1-25 | 0.955 | 0.803
13 | 1-18 | 0.932 | U.OZD
14 | 1-18 | 0.938 | V . O / O
15 | 1-25 | 0.941 | 0.811
16 | 1-17 | 0.972 | 0.939
17 | 1-27 | 0.964 | 0.777
18 | 1-16 | 0.914 | 0.657
19 | 1-19 |  | 0.840
20 | 1-20 | 0.935 | 0.701
21 | 1-22 | n o*7^ U . J l *± | 0.850
22 | 1-33 | r\ qc n V . yoj. | 0.895
23 | 1-19 | 0.991 | 0.959
24 | 1-31 | 0.995 | 0.944
25 | 1-22 | 0.976 | 0.935
26 | 1-27 | 0.996 | 0.928
27 | 1-24 | 0.953 | 0.739
28 | 1-21 | 0.906 | 0.668
29 | 1-31 | u . nab | 0.841
30 | 1-28 | n q q n | 0.893
31 | 1-19 | 0.993 | 0.976
32 | 1-22 |  | 0.909
35 | 1-33 | 0.949 | 0.736
36 | 1-33 | 0.949 | 0.736
46 | 1-19 | 0.970 | 0.951
67 | 1-25 | 0.968 | 0.848
71 | 1-18 | 0.949 | 0.845
72 | 1-30 | 0.991 | 0.919
75 | 1-29 | 0.958 | 0.854
88 | 1-20 | □ . g g 6 | n q^i c U .
94 | 1-33 | 0.994 | 
97 | 1-46 | 0.964 | 0.595
103 | 1-49 | 0.983 | 0.570
108 | 1-26 | 0.978 | 0.885
111 | 1-23 | 0.989 | 0.899
126 | 1-25 | 0.955 | 0.803
129 | 1-19 | 0.963 | 0.918
138 | 1-29 | 0.971 | 0.844
143 | 1-18 | 0.914 | 0.628
148 | 1-20 | 0.969 | 0.904
156 | 1-25 | 0.941 | 0.811
158 | 1-22 | 0.979 | 0.927
160 | 1-17 | 0.972 | 0.939
161 | 1-48 | 0.903 | 0.571
162 | 1-25 | 0.937 | 0.729
168 | 1-16 | 0.939 | 0.826
171 | 1-27 | 0.964 | 0.777
178 | 1-21 | 0.945 | 0.825
180 | 1-27 | 0.981 | 0.941
187 | 1-28 | 0.982 | 0.936
190 | 1-19 | 0.953 | 0.840
196 | 1-22 | 0.975 | 0.916
197 | 1-22 | 0.963 | 0.936
SEQ ID NO:



199 



200 



206 



207 



POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 



MaxS (MAXIMUM 
SCORE) 



1-20 



1-23 



1-30 



1-19 



0.935 



0.977 



0.984 



0.990 
0.974 



MeanS (MEAN 
SCORE) 



0.701 



0.773 



0.890



0.924



208 



210 



1-22 



1-23 



0.940 



0.971 



0.850
0.670



0.849
0.956 



216 



221 
222



1-24 



1-19 



1-19 



1-21 



0.986 



0.961



0.970
0.904



0.917



0.895



0.871 



0.553 



0.555 



230 



231 



232 



239 



1-26 



1-25 



1-23 



0.991



0.953



0.988



0.969



0.959



0.800



0.826



0.828 



240 



0.982 



0.955 



245 



248 



249 



252 



1-17 



1-30 



1-22 



1-23



1-18 



0.982



0.970 



0.976 



0.968 



0.971 



0.955 



0.722 



0.935



0.940



0.923
0.587
1-24



0.883 



1-18 



0.939 



0.868
0.739



272 



283 



284 
290 



304 



312 



313 



1-24 



1-29 



1-31 



1-28 



1-16 



1-17 



0.953 



0.906 



0.997 



0.986 



0.980 



0.907 



0.993 



0.930 



0.688 



0.854 



0.893 



0.635



0.976 



0.753 
0.909 



323 



324 



328 



1-22 



1-17 



0.998 
0.982 



0.971 



0.954



0.865



329 



330 



1-22



1-33



0.963



0.978



0.924
0.841



331 



332 



333 



334



335



336



338 



339 



340 



343 



344 



345 



346 



347 



351 



1-24 



1-24 



1-20 



1-27 



1-20 



1-38 



1-27 



1-36 



1-27 



1-19 



1-22 



1-19 



1-22 



1-24 



1-21 



0.920



0.975
0.S84 



0.899 



0.942 



0.942 



0.973 



0.979 
0.888 



0.971 



0.994 



0.966 



0.936 



0.963 



0.982 



0.918 



0.712 
0.881 



0.941 



0.567 
0.813 



0.653 
0.772 



0.804 



0.865



0.928



0.687



0.822 



0.924 



0.966 



0.815 
0.912 



352 



1-31 



0.988



355 



1-31 



1-29 



0.974 



0.932 



0.839 



0.632



356 



1-15 



1-33 



0.994 



0.935 



0.969 



0.726 



360



361 



362 



1-27 



1-25 



1-22 



0.938 
0.954 



0.929



0.827 



0.674 



0.788
0.715 



363 



364 



365



1-21 



1-33 



0.881



0.978 



0.978 
366 | 1-21 | u .7ib | 0.820
367 | 1-19 |  | 0.822
368 | 1-29 | ft tm | 0.874
370 | 1-24 | 0.920 | 0.712
371 | 1-24 | 0.961 | 0.773
372 | 1-27 | 0.919 | 0.768
373 | 1-19 | 0.986 | 0.945
375 | 1-32 | 0.994 | 0.932
376 | 1-34 |  | 0.810
377 | 1-17 | 0.995 | 0.950
378 | 1-49 | 0.971 | 0.749
380 | 1-20 | l; . jo o | 0.874
381 | 1-20 | 0.928 | 0.782
382 | 1-19 | 0.986 | 0.934
383 | 1-28 | 0.965 | 0.829
384 | 1-39 | 0.970 | 0.551
386 | 1-24 | 0.975 | 0.881
388 |  | 0.989 | 0.868
389 | 1-19 | 0.984 | 0.941
390 | 1-26 | 0.971 | 0.782
392 | 1-20 | 0.981 | 0.900
393 |  | 0.968 | 0.890
394 | 1-23 | 0.937 | 0.701
397 | 1-22 | 0.985 | 0.854
399 | -L — | 0.977 | 0.698
401 | 1-20 | 0.899 | 0.567
402 | 1-22 | 0.967 | 0.931
403 | 1-27 | 0.992 | 0.934
404 | 1-19 | 0.991 | 0.973
405 |  | 0.994 | 0.921
407 | l-JD | 0.987 | 0.658
408 | J. — J 1# | 0.976 | 0.551
409 | 1-33 | 0.897 | 0.570
410 |  | 0.990 | 0.962
411 | J. — o o | 0.977 | 0.827
412 | 1-20 | 0.944 | 0.768
413 | 1-20 | 0.988 | 0.965
414 |  | 0.993 | 0.638
415 | 1-23 | 0.981 | 0.940
417 | 1-29 | 0.941 | 0.672
418 |  | 0.952 | 0.850
419 | 1-19 | 0.986 | 0.967
420 | 1-29 | 0.965 | 0.861
421 | 1-22 | 0.889 | 0.785
422 | 1-48 | 0.982 | 0.862
424 | 1-19 | 0.979 | 0.933
428 | 1-38 | 0.942 | 0.653
430 | 1-18 | 0.947 | 0.595
432 | 1-33 |  | 0.789
433 | 1-26 | 0.979 | 0.904
434 | 1-27 | 0.962 | 0.777
435 | 1-24 | C% COD | 0.977
436 | 1-27 | 0.973 | 0.772
443 | 1-15 | 0.966 | 0.940
448 | 1-36 | 0.979 | 0.804
453 | 1-41 | 0.958 | 0.609
455 | 1-33 | 0.943 | 0.606
457 | 1-27 | 0.888 | 0.597
462 | 1-16 | 0.925 | 0.681
486 | 1-27 | 0.972 | 0.845
495 | 1-24 | 0.917 | 0.636
498 | 1-26 | 0.993 | 0.890
505 | 1-20 | 0.976 | 0.926
507 | 1-17 | 0.966 | 0.687
510 | 1-23 | 0.930 | 0.593
511 | 1-23 | 0.930 | 0.593
512 | 1-23 | 0.930 | 0.593
515 | 1-18 | 0.978 | 0.956
523 | 1-19 | 0.936 | 0.822
529 | 1-22 | 0.963 | 0.924
545 | 1-24 | 0.982 | 0.966
550 | 1-30 | 0.933 | 0.713
552 | 1-21 | 0.973 | 0.912
554 | 1-23 | 0.969 | 0.784
571 | 1-21 | 0.918 | 0.815
574 | 1-31 | 0.988 | 0.912
580 | 1-39 | 0.S25 | 0.556
594 | 1-31 | 0.974 | 0.839
608 | 1-29 | 0.932 | 0.632
609 | 1-29 | 0.932 | 0.632
610 | 1-21 | 0.990 | 0.948
621 | 1-15 | 0.994 | 0.969
623 | 1-33 | 0.935 | 0.726
653 | 1-27 | 0.938 | 0.827
669 | 1-22 | 0.929 | 0.788
677 | 1-16 | 0.948 | 0.807
685 | 1-21 | 0.881 | 0.715
699 | 1-22 | 0.975 | 0.816
702 | 1-31 | 0.968 | 0.898
707 | 1-16 | 0.860 | 0.562
713 | 1-25 | 0.966 | 0.743
718 | 1-19 | 0.936 | 0.822
719 | 1-20 | 0.961 | 0.824
729 | 1-29 | 0.972 | 0.874
735 | 1-46 | 0.903 | 0.598
746 | 1-14 | 0.916 | 0.730
747 | 1-22 | 0.965 | 0.876
748 | 1-29 | 0.968 | 0.785
759 | 1-24 | 0.961 | 0.773
767 | 1-27 | 0.919 | 0.768
768 | 1-33 | 0.900 | 0.585
773 | 1-42 | 0.959 | 0.702
779 | 1-19 | 0.986 | 0.945
797 | 1-19 | 0.944 | 0.759
798 | 1-19 | 0.900 | 0.568
820 | 1-17 | 0.995 | 0.950
827 | 1-49 | 0.971 | 0.749
848 | 1-20 | 0.968 | 0.874
864 | 1-20 | 0.928 | 0.782
866 | 1-19 | 0.986 | 0.934
873 | 1-23 | 0.948 | 0.886
881 | 1-28 | 0.965 | 0.829
887 | 1-39 | 0.970 | 0.551
927 | 1-30 | 0.985 | 0.868
934 | 1-48 | 0.988 | 0.777
939 | 1-39 | 0.994 | 0.889
944 | 1-26 | 0.971 | 0.782
950 | 1-29 | 0.957 | 0.845
963 | 1-20 | 0.981 | 0.900
964 | 1-20 | 0.886 | 0.558
973 | 1-16 | 0.968 | 0.890
980 | 1-34 | 0.961 | 0.749
981 | 1-20 | 0.953 | 0.822
984 | 1-12 | 0.938 | 0.780
1015 | 1-22 | 0.985 | 0.854
1040 | 1-46 | 0.977 | 0.698
1052 | 1-18 | 0.969 | 0.842
1059 | 1-20 | 0.927 | 0.867
1065 | 1-33 | 0.983 | 0.918
1069 | 1-22 | 0.993 | 0.935
1075 | 1-27 | 0.992 | 0.934
1000 | 1-19 | 0.931 | 0.829
1092 | 1-19 | 0.991 | 0.973
1094 | 1-46 | 0.992 | 0.653
1095 | 1-30 | 0.974 | 0.929
1105 | 1-23 | 0.994 | 0.921
1123 | 1-35 | 0.987 | 0.658
1138 | 1-32 | 0.954 | 0.613
1140 | 1-39 | 0.989 | 0.789
1142 | 1-33 | 0.897 | 0.570
1152 | 1-25 | 0.990 | 0.962
1170 | 1-38 | 0.977 | 0.827
1176 | 1-20 | 0.944 | 0.768
1187 | 1-20 | 0.988 | 0.965
1189 | 1-35 | 0.967 | 0.839
1192 | 1-46 | 0.993 | v.OJO
1193 | 1-16 | 0.925 | 0.710
1197 | 1-29 | 0.985 | n a tzi U . O 3J
1208 | 1-23 | 0.981 | 0.940
1225 | 1-29 | 0.941 | 0.672
1245 | 1-19 | 0.986 | 0.967
1258 | 1-29 | 0.965 | 0.861
1265 | 1-22 | 0.889 | 0.785
1266 | 1-20 | 0.944 | 0.809
1276 | 1-48 | 0.982 | 0.862
1292 | 1-19 | 0.979 | 0.933
1296 | 1-21 | 0.98^ | 0.944
1297 | 1-19 | 0.984 | 0.953
1332 | 1-38 | 0.942 | 0.653
1358 | 1-18 | 0.947 | 0.595
1371 | 1-33 | □ _ 357 | 0.789
1380 | 1-26 | 0.979 | 0.904
1397 | 1-27 | 0.962 | 0.777
1399 | 1-23 | 0.997 | 0.960
1404 | 1-24 |  | 0.977
1410 | 1-15 | 0.946 | 0.845
1414 | 1-24 | 0.913 | 0.588
1415 | 1-19 | 0.982 | 0.929
1416 | 1-12 | 0.931 | 0.891
 | 1-30 | 0.933 | 0.563
1420 | 1-20 | 0.881 | U . DO JL
1421 | 1-19 | 0.990 | 0.968
1423 | 1-17 | 0.968 | U . OD J
1424 | 1-21 | 0.885 | 
1425 | 1-24 | 0.913 | 0.588
1426 | 1-24 | 0.913 | 0.588
1428 | 1-25 | 0.957 | 0.899
1430 | 1-34 | 0.977 | 0.819
1431 | 1-28 | 0.979 | 0.923
1432 | 1-36 | 0.957 | 0.613
1433 | 1-32 | 0.921 | 0.753
1434 | 1-39 | 0.983 | 0.621
1435 | 1-25 | 0.910 | 0.631
1436 | 1-42 | 0.988 | 0.868
1437 | 1-22 | 0.998 | 0.980
1442 | 1-20 | 0.918 | 0.753
*% o | 1-12 | 0.931 | 0.891
1462 | 1-18 | 0.968 | 0.888
1490 | 1-20 | 0.881 | 0.561
1518 | 1-17 | 0.968 | 0.863
1525 | -L-*l | 0.885 | 0.591
1547 | 1-28 | 0.974 | 0.891
1561 | 1-25 | 0.967 | 0.899
1580 | 1-17 | 0.923 | 0.824
1593 | 1-28 | 0.979 | 0.923
1596 | 1-16 | 0.929 | 0.709
1601 | 1-36 | 0.957 | 0.613
1606 | 1-22 | 0.979 | 0.831
1607 | 1-20 | 0.974 | 0.770
1608 | 1-32 | 0.921 | n i C"3 U . / 33
1614 | 1-33 | 0.969 | 0.829
1616 | 1-20 | 0.959 | 0.869
1625 | 1-39 | 0.983 | 0.621
1632 | 1-25 | 0.910 | 0.631
1636 | 1-33 | 0.897 | 0.591
1639 | 1-42 | 0.988 | 0.868
1645 | 1-20 | 0.927 | 0.568
1647 | 1-17 | 0.923 | 0.742
1648 | 1-22 | 0.998 | 0.980
TABLE 6

SEQ ID NO: of full-length nucleotide sequence | SEQ ID NO: of full-length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority docket number_corresponding SEQ ID NO: in priority application | SEQ ID NO: in U.S.S.N. 09/488,725
1 | 1787 | 3573 | 5359 | 784CIP2_1 | 1103
2 | 1788 | 3574 | 5360 | 784CIP2_2 | 2673
3 | 1789 | 3575 | 5361 | 784CIP2_3 | 4117
4 | 1790 | 3576 | 5362 | 784CIP2_4 | 5556
5 | 1791 | 3577 | 5363 | 784CIP2_5 | 5562
6 | 1792 | 3578 | 5364 | 784CIP2_6 | 5562
7 | 1793 | 3579 | 5365 | 784CIP2_7 | 5562
8 | 1794 | 3580 | 5366 | 784CIP2_8 | 5562
9 | 1795 | 3581 | 5367 | 784CIP2_9 | 5563
10 | 1796 | 3582 | 5368 | 784CIP2_10 | 5564
11 | 1797 | 3583 | 5369 | 784CIP2_11 | 5565
12 | 1798 | 3584 | 5370 | 784CIP2_12 | 5689
13 | 1799 | 3585 | 5371 | 784CIP2_13 | 5729
14 | 1800 | 3586 | 5372 | 784CIP2_14 | 5745
15 | 1801 | 3587 | 5373 | 784CIP2_15 | 5777
16 | 1802 | 3588 | 5374 | 784CIP2_16 | 5777
17 | 1803 | 3589 | 5375 | 784CIP2_17 | 5789
18 | 1804 | 3590 | 5376 | 784CIP2_18 | 5792
19 | 1805 | 3591 | 5377 | 784CIP2_19 | 5804
20 | 1806 | 3592 | 5378 | 784CIP2_20 | 5805
21 | 1807 | 3593 | 5379 | 784CIP2_21 | 5905
22 | 1808 | 3594 | 5380 | 784CIP2_22 | 5844
23 | 1809 | 3595 | 5381 | 784CIP2_23 | 5844
24 | 1810 | 3596 | 5382 | 784CIP2_24 | 5850
25 | 1811 | 3597 | 5383 | 784CIP2_25 | 5867
26 | 1812 | 3598 | 5384 | 784CIP2_26 | 5973
27 | 1813 | 3599 | 5385 | 784CIP2_27 | 5995
28 | 1814 | 3600 | 5386 | 784CIP2_28 | 5995
29 | 1815 | 3601 | 5387 | 784CIP2_29 | 6005
30 | 1816 | 3602 | 5388 | 784CIP2_30 | 6007
31 | 1817 | 3603 | 5389 | 784CIP2_31 | 6007
32 | 1818 | 3604 | 5390 | 784CIP2_32 | 6009
33 | 1819 | 3605 | 5391 | 784CIP2_33 | 6012
34 | 1820 | 3606 | 5392 | 784CIP2_34 | 6015
35 | 1821 | 3607 | 5393 | 784CIP2_35 | 6016
36 | 1822 | 3608 | 5394 | 784CIP2_36 | 6016
37 | 1823 | 3609 | 5395 | 784CIP2_37 | 6018
38 | 1824 | 3610 | 5396 | 784CIP2_38 | 6018
39 | 1825 | 3611 | 5397 | 784CIP2_39 | 6018
40 | 1826 | 3612 | 5398 | 784CIP2_40 | 6023
41 | 1827 | 3613 | 5399 | 784CIP2_41 | 6070
42 | 1828 | 3614 | 5400 | 784CIP2_42 | 6081
43 | 1829 | 3615 | 5401 | 784CIP2_43 | 6089
44 | 1830 | 3616 | 5402 | 784CIP2_44 | 6118
45 | 1831 | 3617 | 5403 | 784CIP2_45 | 6118
46 | 1832 | 3618 | 5404 | 784CIP2_46 | 6130
47 | 1833 | 3619 | 5405 | 784CIP2_47 | 6177
48 | 1834 | 3620 | 5406 | 784CIP2_48 | 6189
49 | 1835 | 3621 | 5407 | 784CIP2_49 | 6191
50 | 1836 | 3622 | 5408 | 784CIP2_50 | 6204
51 | 1837 | 3623 | 5409 | 784CIP2_51 | 6204
52 | 1838 | 3624 | 5410 | 784CIP2_52 | 6284
53 | 1839 | 3625 | 5411 | 784CIP2_53 | 6367
54 | 1840 | 3626 | 5412 | 784CIP2_54 | 6436
55 | 1841 | 3627 | 5413 | 784CIP2_55 | 6442
56 | 1842 | 3628 | 5414 | 784CIP2_56 | 6445
57 | 1843 | 3629 | 5415 | 784CIP2_57 | 6457
58 | 1844 | 3630 | 5416 | 784CIP2_58 | 6458
59 | 1845 | 3631 | 5417 | 784CIP2_59 | 6458
60 | 1846 | 3632 | 5418 | 784CIP2_60 | 6462
61 | 1847 | 3633 | 5419 | 784CIP2_61 | 6472
62 | 1848 | 3634 | 5420 | 784CIP2_62 | 6499
63 | 1849 | 3635 | 5421 | 784CIP2_63 | 6499
64 | 1850 | 3636 | 5422 | 784CIP2_64 | 6505
65 | 1851 | 3637 | 5423 | 784CIP2_65 | 6534
66 | 1852 | 3638 | 5424 | 784CIP2_66 | 6534
67 | 1853 | 3639 | 5425 | 784CIP2_67 | 6540
68 | 1854 | 3640 | 5426 | 784CIP2_68 | 6550
69 | 1855 | 3641 | 5427 | 784CIP2_69 | 6550
70 | 1856 | 3642 | 5428 | 784CIP2_70 | 6592
71 | 1857 | 3643 | 5429 | 784CIP2_71 | 6645
72 | 1858 | 3644 | 5430 | 784CIP2_72 | 6671
73 | 1859 | 3645 | 5431 | 784CIP2_73 | 6763
74 | 1860 | 3646 | 5432 | 784CIP2_74 | 6763
75 | 1861 | 3647 | 5433 | 784CIP2_75 | 6786
76 | 1862 | 3648 | 5434 | 784CIP2_76 | 6824
77 | 1863 | 3649 | 5435 | 784CIP2_77 | 6830
78 | 1864 | 3650 | 5436 | 784CIP2_78 | 6831
79 | 1865 | 3651 | 5437 | 784CIP2_79 | 6832
80 | 1866 | 3652 | 5438 | 784CIP2_80 | 6834
81 | 1867 | 3653 | 5439 | 784CIP2_81 | 6834
82 | 1868 | 3654 | 5440 | 784CIP2_82 | 6835
83 | 1869 | 3655 | 5441 | 784CIP2_83 | 6837
84 | 1870 | 3656 | 5442 | 784CIP2_84 | 6843
85 | 1871 | 3657 | 5443 | 784CIP2_85 | 6859
86 | 1872 | 3658 | 5444 | 784CIP2_86 | 6915
87 | 1873 | 3659 | 5445 | 784CIP2_87 | 6932
88 | 1874 | 3660 | 5446 | 784CIP2_88 | 6957
89 | 1875 | 3661 | 5447 | 784CIP2_89 | 6961
90 | 1876 | 3662 | 5448 | 784CIP2_90 | 6973
91 | 1877 | 3663 | 5449 | 784CIP2_91 | 6973
92 | 1878 | 3664 | 5450 | 784CIP2_93 | 7007
93 | 1879 | 3665 | 5451 | 784CIP2_94 | 7018
94 | 1880 | 3666 | 5452 | 784CIP2_95 | 7019
95 | 1881 | 3667 | 5453 | 784CIP2_96 | 7020
96 | 1882 | 3668 | 5454 | 784CIP2_97 | 7020
97 | 1883 | 3669 | 5455 | 784CIP2_98 | 7021
98 | 1884 | 3670 | 5456 | 784CIP2_99 | 7023
99 | 1885 | 3671 | 5457 | 784CIP2_100 | 7027
100 | 1886 | 3672 | 5458 | 784CIP2_101 | 7028
101 | 1887 | 3673 | 5459 | 784CIP2_102 | 7029
102 | 1888 | 3674 | 5460 | 784CIP2_103 | 7031
103 | 1889 | 3675 | 5461 | 784CIP2_104 | 7032
104 | 1890 | 3676 | 5462 | 784CIP2_105 | 7033
105 | 1891 | 3677 | 5463 | 784CIP2_106 | 7035
106 | 1892 | 3678 | 5464 | 784CIP2_107 | 7036
107 | 1893 | 3679 | 5465 | 784CIP2_108 | 7039
108 | 1894 | 3680 | 5466 | 784CIP2_109 | 7043
109 | 1895 | 3681 | 5467 | 784CIP2_110 | 7044
110 | 1896 | 3682 | 5468 | 784CIP2_111 | 7046
111 | 1897 | 3683 | 5469 | 784CIP2_112 | 7054
112 | 1898 | 3684 | 5470 | 784CIP2_113 | 7061
113 | 1899 | 3685 | 5471 | 784CIP2_114 | 7077
114 | 1900 | 3686 | 5472 | 784CIP2_115 | 7092
115 | 1901 | 3687 | 5473 | 784CIP2_116 | 7094
116 | 1902 | 3688 | 5474 | 784CIP2_117 | 7106
117 | 1903 | 3689 | 5475 | 784CIP2_118 | 7107
118 | 1904 | 3690 | 5476 | 784CIP2_119 | 7111
119 | 1905 | 3691 | 5477 | 784CIP2_120 | 7123
120 | 1906 | 3692 | 5478 | 784CIP2_121 | 7142
121 | 1907 | 3693 | 5479 | 784CIP2_122 | 7142
122 | 1908 | 3694 | 5480 | 784CIP2_123 | 7154
123 | 1909 | 3695 | 5481 | 784CIP2_124 | 7160
124 | 1910 | 3696 | 5482 | 784CIP2_125 | 7169
125 | 1911 | 3697 | 5483 | 784CIP2_126 | 71 Off /lob
126 | 1912 | 3698 | 5484 | 784CIP2_127 | 7197
127 | 1913 | 3699 | 5485 | 784CIP2_128 | 7219
128 | 1914 | 3700 | 5486 | 784CIP2_129 | 7226
129 | 1915 | 3701 | 5487 | 784CIP2_130 | 7229
130 | 1916 | 3702 | 5488 | 784CIP2_131 | 7234
131 | 1917 | 3703 | 5489 | 784CIP2_132 | 7235
132 | 1918 | 3704 | 5490 | 784CIP2_133 | 7235
133 | 1919 | 3705 | 5491 | 784CIP2_134 | 7238
134 | 1920 | 3706 | 5492 | 784CIP2_135 | 7247
135 | 1921 | 3707 | 5493 | 784CIP2_136 | 7261
136 | 1922 | 3708 | 5494 | 784CIP2_137 | 7262
137 | 1923 | 3709 | 5495 | 784CIP2_138 | 7267
138 | 1924 | 3710 | 5496 | 784CIP2_139 | 7272
139 | 1925 | 3711 | 5497 | 784CIP2_140 | 7273
140 | 1926 | 3712 | 5498 | 784CIP2_141 | 7282
141 | 1927 | 3713 | 5499 | 784CIP2_142 | 7288
142 | 1928 | 3714 | 5500 | 784CIP2_143 | 7291
143 | 1929 | 3715 | 5501 | 784CIP2_144 | 7293
144 | 1930 | 3716 | 5502 | 784CIP2_145 | 7294
145 | 1931 | 3717 | 5503 | 784CIP2_146 | 7299
146 | 1932 | 3718 | 5504 | 784CIP2_147 | 7300
147 | 1933 | 3719 | 5505 | 784CIP2_148 | 7312
148 | 1934 | 3720 | 5506 | 784CIP2_149 | 7313
149 | 1935 | 3721 | 5507 | 784CIP2_150 | 7315
150 | 1936 | 3722 | 5508 | 784CIP2_151 | 7318
151 | 1937 | 3723 | 5509 | 784CIP2_152 | 7321
152 | 1938 | 3724 | 5510 | 784CIP2_153 | 7330
153 | 1939 | 3725 | 5511 | 784CIP2_154 | 7331
154 | 1940 | 3726 | 5512 | 784CIP2_155 | 7333
155 | 1941 | 3727 | 5513 | 784CIP2_156 | 7350
156 | 1942 | 3728 | 5514 | 784CIP2_157 | 7352
157 | 1943 | 3729 | 5515 | 784CIP2_158 | 7384
158 | 1944 | 3730 | 5516 | 784CIP2_159 | 7403
159 | 1945 | 3731 | 5517 | 784CIP2_160 | 7431
160 | 1946 | 3732 | 5518 | 784CIP2_161 | 7441
161 | 1947 | 3733 | 5519 | 784CIP2_162 | 7453
162 | 1948 | 3734 | 5520 | 784CIP2_163 | 7467
163 | 1949 | 3735 | 5521 | 784CIP2_164 | 7471
164 | 1950 | 3736 | 5522 | 784CIP2_165 | 7493
165 | 1951 | 3737 | 5523 | 784CIP2_166 | 7502
166 | 1952 | 3738 | 5524 | 784CIP2_167 | 7511
167 | 1953 | 3739 | 5525 | 784CIP2_168 | TCI /I
168 | 1954 | 3740 | 5526 | 784CIP2_169 | 7520
169 | 1955 | 3741 | 5527 | 784CIP2_170 | 
170 | 1956 | 3742 | 5528 | 784CIP2_171 | 7570
171 | 1957 | 3743 | 5529 | 784CIP2_172 | 7578
172 | 1958 | 3744 | 5530 | 784CIP2_173 | 7583
173 | 1959 | 3745 | 5531 | 784CIP2_174 | 7592
174 | 1960 | 3746 | 5532 | 784CIP2_175 | 7601
175 | 1961 | 3747 | 5533 | 784CIP2_176 | 7602
176 | 1962 | 3748 | 5534 | 784CIP2_177 | 7608
177 | 1963 | 3749 | 5535 | 784CIP2_178 | 7615
178 | 1964 | 3750 | 5536 | 784CIP2_179 | 7617
179 | 1965 | 3751 | 5537 | 784CIP2_181 | 7624
180 | 1966 | 3752 | 5538 | 784CIP2_182 | 7626
181 | 1967 | 3753 | 5539 | 784CIP2_183 | 7640
182 | 1968 | 3754 | 5540 | 784CIP2_184 | 7641
183 | 1969 | 3755 | 5541 | 784CIP2_185 | 7641
184 | 1970 | 3756 | 5542 | 784CIP2_186 | 7641
185 | 1971 | 3757 | 5543 | 784CIP2_187 | 7642
186 | 1972 | 3758 | 5544 | 784CIP2_188 | 7649
187 | 1973 | 3759 | 5545 | 784CIP2_189 | 7656
188 | 1974 | 3760 | 5546 | 784CIP2_190 | 7657
189 | 1975 | 3761 | 5547 | 784CIP2_191 | 7657
190 | 1976 | 3762 | 5548 | 784CIP2_192 | 7662
191 | 1977 | 3763 | 5549 | 784CIP2_193 | 7668
192 | 1978 | 3764 | 5550 | 784CIP2_194 | 7673
193 | 1979 | 3765 | 5551 | 784CIP2_195 | 7690
194 | 1980 | 3766 | 5552 | 784CIP2_196 | 7700
195 | 1981 | 3767 | 5553 | 784CIP2_197 | 7709
196 | 1982 | 3768 | 5554 | 784CIP2_198 | 7736
197 | 1983 | 3769 | 5555 | 784CIP2_199 | 7737
198 | 1984 | 3770 | 5556 | 784CIP2_200 | 7744
199 | 1985 | 3771 | 5557 | 784CIP2_201 | 7771
200 | 1986 | 3772 | 5558 | 784CIP2_202 | 7786
201 | 1987 | 3773 | 5559 | 784CIP2_203 | 7791
202 | 1988 | 3774 | 5560 | 784CIP2_204 | 7797
203 | 1989 | 3775 | 5561 | 784CIP2_205 | 7806
204 | 1990 | 3776 | 5562 | 784CIP2_206 | 7812
205 | 1991 | 3777 | 5563 | 784CIP2_207 | 7812
206 | 1992 | 3778 | 5564 | 784CIP2_208 | 7818
207 | 1993 | 3779 | 5565 | 784CIP2_209 | 7822
208 | 1994 | 3780 | 5566 | 784CIP2_210 | 7827
209 | 1995 | 3781 | 5567 | 784CIP2_211 | 7830
210 | 1996 | 3782 | 5568 | 784CIP2_212 | 7835
211 | 1997 | 3783 | 5569 | 784CIP2_214 | 7840
212 | 1998 | 3784 | 5570 | 784CIP2_215 | 7858
213 | 1999 | 3785 | 5571 | 784CIP2_216 | 7858
214 | 2000 | 3786 | 5572 | 784CIP2_217 | 7861
215 | 2001 | 3787 | 5573 | 784CIP2_218 | 7866
216 | 2002 | 3788 | 5574 | 784CIP2_219 | 7868
217 | 2003 | 3789 | 5575 | 784CIP2_220 | 7896
218 | 2004 | 3790 | 5576 | 784CIP2_221 | 7898
219 | 2005 | 3791 | 5577 | 784CIP2_222 | 7900
220 | 2006 | 3792 | 5578 | 784CIP2_223 | 7906
221 | 2007 | 3793 | 5579 | 784CIP2_224 | 7908
222 | 2008 | 3794 | 5580 | 784CIP2_225 | 7909
223 | 2009 | 3795 | 5581 | 784CIP2_226 | 7917
224 | 2010 | 3796 | 5582 | 784CIP2_227 | 7932
225 | 2011 | 3797 | 5583 | 784CIP2_228 | 7940
226 | 2012 | 3798 | 5584 | 784CIP2_229 | 7940
227 | 2013 | 3799 | 5585 | 784CIP2_230 | 7984
228 | 2014 | 3800 | 5586 | 784CIP2_231 | 7984
229 | 2015 | 3801 | 5587 | 784CIP2_232 | 8001
230 | 2016 | 3802 | 5588 | 784CIP2_233 | 8021
231 | 2017 | 3803 | 5589 | 784CIP2_234 | 8029
232 | 2018 | 3804 | 5590 | 784CIP2_235 | 8033
233 | 2019 | 3805 | 5591 | 784CIP2_236 | 8040
234 | 2020 | 3806 | 5592 | 784CIP2_237 | 8052
235 | 2021 | 3807 | 5593 | 784CIP2_238 | 8096
236 | 2022 | 3808 | 5594 | 784CIP2_239 | 8096
237 | 2023 | 3809 | 5595 | 784CIP2_240 | 8113
238 | 2024 | 3810 | 5596 | 784CIP2_241 | 8126
239 | 2025 | 3811 | 5597 | 784CIP2_242 | 8132
240 | 2026 | 3812 | 5598 | 784CIP2_243 | 8137
241 | 2027 | 3813 | 5599 | 784CIP2_244 | 8137
242 | 2028 | 3814 | 5600 | 784CIP2_245 | 8159
243 | 2029 | 3815 | 5601 | 784CIP2_246 | 8159
244 | 2030 | 3816 | 5602 | 784CIP2_247 | 8161
245 | 2031 | 3817 | 5603 | 784CIP2_248 | 8176
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of con tig 
nucleotide 
sequence 


SEQ ID 

NO: 

of confcig 
pept ide 
sequence 


Priority 
docket number^ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488, 725 


246 




3 818 


5604 


784CIP2 249 


8196 


247 




3 819 


5605 


784CIP2 250 


8200 


248 


2034 


3 820 


5606 


784CIP2 251 


8212 


24 9 


2035 


3821 


5607 


784CIP2 252 


L 8220 


2S0 


2 03 6 


3 822 


5608 


784CIP2 253 


8238 


251 


2 03 7 


3823 


5609 


784CIP2 254 


B254 


252 


2038 


3824 


5610 


784CIP2 2S5 


8255 


253 


2039 


3 825 


5611 


784CIP2 256 


j_ 8288 


254 


2 04 0 


3826 


5612 


784CIP2 257 


8296 


255 


204 1 


3827 


5613 


784CIP2 258 


8329 


256 


2042 


3828 


5614 


784CIP2 259 


8362 


257 


2043 


3829 


5615 


784CIP2 260 


8429 


258 


2044 


3830 


1 5616 


784CXP2_261 


8436 ■ 




2045 


3831 


5617 


784CIP2 262 


8448 


o c n 

nr-i - 


2046 


3832 


5618 


. 784CIP2 263 


8472 


*b X 


2047 


3833 


5619 


784CIP2 264 


8502 




2048 


3834 


5620 


784CIP2 265 


8504 


Zoo 


204 9 


3835 


5621 


784CIP2 266 


8507 




2050 


3836 


5622 


784CIP2268 


8509 


265 


2051 


3837 


5623 


784CIP2 269 - 


8515 


266 


2052 


3838 


5624 


784CIP2 270 


8519 


267 


2053 


3839 


5625 


784CIP2_271 


8530 


268 


2054 


3840 


5626 


784CIP2_272 


8532 


269 


2055 


3841 


5627 


784CIP2_273 


8532 


270 


2056 


3842 


5628 


784CIP2 274 


B539 


271 


2057 


3843 


5629 


784CIP2275 


8541 


272 


2058 


3844 


5630 


784CIP2 276 


8543 


273 


2059 


3845 


5631 


784CIP2 277 


8S93 


274 


2060 


3846 


5632 


784CIP2 278 


8595 


275 


2061 


3847 


5633 


784CIP2_279 


8615 


276 


2062 


3848 


5634 


784CIP2 280 


8620 


277 


2063 


3849 


5635 


784CIP2_28l 


8621 


278 


2064 


3850 


5636 


784CIP2 282 


8623 


279 


2065 


3851 


5637 


784CIP2 283 


8625 


280 


2066 


3852 


5638 


784CIP2_284 


8628 


tol 


2067 


3853 


563S 


7B4CIP2_28 5 


8628 


2 82 


2068 


3 854 


5640 


784CIP2 286 


8629 




2069 


3855 


5641 


784CIP2 287 


8630 


284 


2070 


3856 


5642 


784CIP2 288 


8631 


285 


2071 


3 857 


5643 


784CIP2 289 


8633 


286 


2072 


3858 


5644 


784CIP2 290 


8634 


287 


2 073 


3 859 


5645 


784CIP2_291 


8635 


288 


2 074 


3 860 


5646 


784CIP2 292 


8636 


289 


IZi 


3861 


5647 


784CIP2 293 


8659 


290 


*u/b 


3862 


5648 


784CIP2_294 


8660 


291 


2077 


3 863 


5649 


784CIP2 295 


8667


292 


2078


3864


5650 


784CIP2 296 


8667 


293 


2079 


3865


5651 


784CIP2_297


8685 


294 


2080 


3866


5652 


784CIP2 298 


8805


295 


2081 


3867


5653 


784CIP2_299 


8896 


296 


2082 


3868


5654 


784CIP2 300 


8978 


297 


2083


3869 


5655 


784CIP2 301 


9046 


298 


2084 


3870 


5656


784CIP2_302


9048


299 


2085


3871 


5657 


784CIP2 303 


9116 


300 


2086 


3872 


5658 


784CIP2_304 


9195 


301 


2087 


3873 


5659 


784CIP2 305 


9201 


302 


2088 


3874 


5660 


784CIP2_306


9307 


303 


2089 


3875 


5661 


784CIP2_307


9321 


304 


2090


3876 


5662 


784CIP2_308


9397 


305 


2091 


3877 


5663 


784CIP2 309 


9405


306 


2092 


3878 


5664 


784CIP2_310


9406 


307 


2093 


3879 


5665 


784CIP2 311 


9422


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




308 


2094 


3880 


5666 


784CIP2_312 


9494 


309 


2095 


3881 


5667 


784CIP2_313


9512 


310


2096 


3882 


5668 


784CIP2 314 


9632 


311 


2097 


3883 


5669 


784CIP2_315 


9661 


312 


2098 


3884 


5670 


784CIP2_316 


9664 


313 


2099 


3885 


5671 


784CIP2_317 


9691 


314 


2100 


3886 


5672 


784CIP2_318


9700 


315 


2101 


3887 


5673 


784CIP2_319


9716 


316 


2102 


3888 


5674 


784CIP2_320


9721 


317 


2103 


3889 


5675 


784CIP2_321 


9870 


318 


2104 


3890 


5676 


784CIP2_322 


9887 


319 


2105


3891 


5677 


784CIP2 323 


9923 


320 


2106 


3892 


5678 


784CIP2_324


9938 


321 


2107 


3893 


5679 


784CIP2_325 


9964 


322 


2108 


3894 


5680 


784CIP2_326 


10007 


323 


2109 


3895 


5681 


784CIP2_327 


10009 


324 


2110 


3896 


5682 


784CIP2_328 


10046 


325 


2111 


3897 


5683 


784CIP2_329 


10156 


326 


2112 


3898 


5684 


784CIP2_330 


10276 


327 


2113 


3899 


5685 


784CIP2_331 


10283 


328 


2114 


3900 


5686 


784CIP2B_1


152 


329 


2115 


3901 


5687 


784CIP2B_2 


167 


330 


2116 


3902 


5688 


784CIP2B_3 


205 


331 


2117 


3903 


5689 


784CIP2B_4


210 


332 


2118 


3904 


5690 


784CIP2B_5 


225 


333 


2119 


3905 


5691 


784CIP2B_6 


226 


334 


2120 


3906 


5692 


784CIP2B_7 


264 


335 


2121 


3907 


5693 


784CIP2B_8 


268 


336 


2122 


3908 


5694 


784CIP2B_9 


293 


337 


2123 


3909 


5695 


784CIP2B_10


293 


338 


2124 


3910 


5696 


784CIP2B_11 


293 


339 


2125 


3911 


5697 


784CIP2B_12 


302 


340 


2126 


3912 


5698 


784CIP2B_13 


311 


341 


2127 


3913 


5699 


784CIP2B_14


352 


342 


2128 


3914 


5700 


784CIP2B_15 


358 


343 


2129 


3915 


5701 


784CIP2B 16 


368 


344 


2130 


3916 


5702 


784CIP2B_17 


393 


345 


2131 


3917 


5703 


784CIP2B_18 


477 


346 


2132 


3918 


5704 


784CIP2B_19


508 


347 


2133 


3919 


5705 


784CIP2B_20 


508 


348 


2134 


3920 


5706 


784CIP2B_21 


515 


349 


2135 


3921 


5707 


784CIP2B_22


578 


350 


2136 


3922 


5708 


784CIP2B_23 


588 


351 


2137 


3923 


5709 


784CIP2B_24 


591 


352 


2138 


3924 


5710 


784CIP2B_25


593 


353 


2139 


3925 


5711 


784CIP2B_26 


594 


354 


2140 


3926 


5712 


784CIP2B_27 


619 


355 


2141 


3927 


5713 


784CIP2B_28 


620 


356 


2142 


3928 


5714 


784CIP2B_29


654 


357 


2143 


3929 


5715 


784CIP2B_30 


692 


358 


2144 


3930 


5716 


784CIP2B_31 


753 


359 


2145 


3931 


5717 


784CIP2B_32 


758 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361 


2147 


3933 


5719 


784CIP2B_34 


833 


362 


2148 


3934 


5720 


784CIP2B_35 


838 


363 


2149 


3935 


5721 


784CIP2B_36 


870 


364 


2150 


3936 


5722 


784CIP2B_37


891 


365 


2151 


3937 


5723 


784CIP2B_38


891 


366 


2152 


3938 


5724 


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B 41 


932 


369 


2155 


3941 


5727 


784CIP2B 42 


942 






WO 01/53312 



PCT/US00/34263


370 


2156 


3942 


5728 


784CIP2B 43 


958 


371 


2157 


3943 


5729


784CIP2B_44


968


372 


2158 


3944 


5730


784CIP2B_45


992 


373 


2159 


3945 


5731


784CIP2B_46


1025 


374 


2160 


3946 


5732


784CIP2B_47


1074


375 


2161 


3947 


5733


784CIP2B 48 


1104 


376 


2162 


3948 


5734


784CIP2B_49


1114 


377 


2163 


3949 


5735


784CIP2B_50


1144


378 


2164 


3950 


5736


784CIP2B_51


1262 


379 


2165 


3951 


5737


784CIP2B 52 


1318 


380 


2166 


3952 


5738


784CIP2B 53 


1319 


381 


2167 


3953


5739


784CIP2B 54 


1328 


382 


2168 


3954 


5740


784CIP2B 55 


1436 


383 


2169 


3955 


5741


784CIP2B 56 


1464 


384 


2170


3956


5742


784CIP2B 57 


1584 


385 


2171 


3957


5743 


784CIP2B 58 


1617 


386 


2172 


3958


5744 


784CIP2B 59 


1724 


387 


2173


3959


5745 


784CIP2B 60 


1728 


388 


2174


3960


5746


784CIP2B 61 


1772 


389 


2175 


3961


5747


784CIP2B_62


1809 


390 


2176 


3962 


5748


784CIP2B 63 


1868 


391 


2177 


3963


5749


784CIP2B_64


1898 


392 


2178 


3964


5750 


784CIP2B_65 


1926 


393 


2179 


3965 


5751 


784CIP2B_66 


1965 


394 


2180


3966


5752 


784CIP2B 67 


1967


395


2181


3967


5753 


784CIP2B 68 


1995 


396 


2182 


3968


5754


784CIP2B_69


2005


397 


2183 


3969 


5755 


784CIP2B_70 


2027 


398 


2184


3970


5756


784CIP2B 71 


2055 


399 


2185 


3971


5757


784CIP2B 72 


2103 


400 


2186


3972


5758 


784CIP2B_73 


2106 


401 


2187 


3973 


5759 


784CIP2B_74 


2166 


402


2188 


3974


5760 


784CIP2B 75 


2175 


403 


2189 


3975


5761 


784CIP2B 76 


2176 


404 


2190 


3976


5762 


784CIP2B 78 


2236 


405 


2191 


3977


5763 


784CIP2B 79 


2250


406 


2192 


3978


5764 


784CIP2B_80 


2300


407


2193


3979


5765


784CIP2B 81 


2323 


408 


2194 


3980


5766


784CIP2B 82 


2340 


409 


2195 


3981 


5767 


784CIP2B 83 


2371 


410 


2196 


3982 


5768 


784CIP2B 84 


2399


411 


2197 


3983


5769


784CIP2B 85 


2411 


412 


2198 


3984


5770 


784CIP2B 86 


2428 


413 


2199 


3985


5771


784CIP2B_87


2430 


414 


2200 


3986


5772 


784CIP2B 88 


2439 


415 


2201 


3987 


5773 


784CIP2B_89


2447 


416 


2202 


3988 


5774


784CIP2B_90


2461


417 


2203 


3989 


5775


784CIP2B_91


2487


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205 


3991 


5777


784CIP2B 93 


2512 


420


2206


3992


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208 


3994 


5780 


784CIP2B 96 


2816 


423 


2209 


3995 


5781 


784CIP2B_97 


2818 


424 


2210 


3996 


5782 


784CIP2B_98 


2819 


425 


2211 


3997 


5783


784CIP2B 99 


2943 


426 


2212 


3998 


5784 


784CIP2B 100 


3137 


427 


2213 


3999 


5785 


784CIP2B_101


3137 


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B 104 


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3362




432 


2218 


4004 


5790 


784CIP2B_106


3417


433 


2219 


4005 


5791 


784CIP2B_107


3418 


434 


2220 


4006 


5792 


784CIP2B_108


3442


435 


2221 


4007 


5793


784CIP2B_109


3442 


436 


2222 


4008 


5794 


784CIP2B_110


3444 


437 


2223 


4009 


5795 


784CIP2B_111


3855 


438 


2224 


4010 


5796 


784CIP2B_112


jodj 


439 


2225 


4011 


5797


784CIP2B_113


4090


440 


2226 


4012 


5798 


784CIP2B_114


/I AC 


441 


2227


4013


5799


784CIP2B_115


4142 


442 


2228 


4014 


5800 


784CIP2B_116


4142 


443 


2229 


4015 


5801 


784CIP2B_117


4149


444 


2230 


4016 


5802 


784CIP2B 118 


4196 


445 


2231 


4017 


5803


784CIP2B_119 


4202 


446 


2232 


4018 


5804 


784CIP2B_120 


4274 


447 


2233 


4019 


5805 


784CIP2B 121 


4304


448 


2234 


4020 


5806 


784CIP2B_122 


4306 


449 


2235 


4021 


5807 


784CIP2B 123 


4311


450 


2236 


4022 


5808 


784CIP2B_124 


4321 


451 


2237 


4023 


5809 


784CIP2B_125 


4323 


452 


2238 


4024 


5810 


784CIP2B_126 


4332 


453 


2239 


4025 


5811 


784CIP2B_127 


4488 


454 


2240 


4026 


5812 


784CIP2B_128 


4588 


455 


2241 


4027 


5813 


784CIP2B_129 


5569 


456 


2242 


4028 


5814 


784CIP2B_130 


5573 


457 


2243 


4029 


5815 


784CIP2B_131 


5577 


458 


2244 


4030 


5816 


784CIP2B_132 


5579 


459 


2245 


4031 


5817 


784CIP2B_133


5582 


460 


2246 


4032 


5818 


784CIP2B_134 


5583 


461 


2247 


4033 


5819 


784CIP2B_135


5584 


462 


2248 


4034 


5820 


784CIP2B_136


5585 


463 


2249 


4035 


5821 


784CIP2B_137


5591 


464 


2250 


4036 


5822 


784CIP2B_138 


5593 


465 


2251 


4037 


5823


784CIP2B_139 


5594 


466 


2252 


4038 


5824 


784CIP2B_140 


5594 


467 


2253 


4039 


5825 


784CIP2B_141 


5598 


468 


2254 


4040 


5826 


784CIP2B_142 


5602 


469 


2255 


4041 


5827 


784CIP2B_143 


5605 


470 


2256 


4042 


5828 


784CIP2B_144 


5608 


471 


2257 


4043


5829 


784CIP2B_145 


5617 


472 


2258 


4044 


5830 


784CIP2B_146


5620 


473 


2259 


4045 


5831 


784CIP2B_147 


5622 


474 


2260 


4046 


5832 


784CIP2B_148


c<ro^ 


475 


2261 


4047 


5833 


784CIP2B_149


=>t»-i4 


476 


2262 


4048 


5834 


784CIP2B_150


5625 


477 


2263 


4049 


5835 


784CIP2B_151 


5627 


478 


2264 


4050 


5836 


784CIP2B_152




479 


2265 


4051 


5837 


784CIP2B_153


CCIft 


480 


2266 


4052 


5838 


784CIP2B_154


30 J £ 


481 


2267 


4053 


5839 


784CIP2B_155 


5640


482 


2268 


4054 


5840 


784CIP2B_156


ab9 x 


483 


2269 


4055


5841 


784CIP2B_157 


5643 


484


2270


4056


5842


784CIP2B_158


5647


485 


2271 


4057 


5843 


784CIP2B_159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160 


5658 


487 


2273 


4059 


5845 


784CIP2B_161


5659 


488 


2274 


4060 


5846 


784CIP2B_162 


5667 


489 


2275 


4061 


5847


784CIP2B_163 


5672 


490 


2276 


4062 


5848 


784CIP2B_164 


5674 


491


2277 


4063 


5849 


784CIP2B_165


5678 


492 


2278 


4064 


5850


784CIP2B_166 


5680 


493 


2279 


4065 


5851 


784CIP2B 167 


5684


494 


2280 


4066 


5852


784CIP2B_168


5686 


495 


2281 


4067 


5853 


784CIP2B_169


5694


496 


2282 


4068 


5854 


784CIP2B_170


5698 


497 


2283 


4069 


5855 


784CIP2B_171


5699 


498 


2284 


4070 


5856 


784CIP2B_172


5712 


499 


2285 


4071 


5857


784CIP2B_173


5719


500


2286 


4072 


5858


784CIP2B_174


5720 


501 


2287 


4073 


5859


784CIP2B_175


5727


502 


2288 


4074 


5860 


784CIP2B_176


5730


503 


2289 


4075 


5861 


784CIP2B_177


5734 


504 


2290 


4076 


5862 


784CIP2B_178


5738 


505 


2291 


4077 


5863


784CIP2B_179


5739 


506 


2292 


4078


5864


784CIP2B_180


5740


507 


2293 


4079


5865


784CIP2B_181


5744 


508 


2294 


4080


5866 


784CIP2B 182 


5748 


509 


2295 


4081


5867


784CIP2B 183 


5749 


510 


2296 


4082


5868


784CIP2B_184


5750 


511 


2297 


4083


5869


784CIP2B 185 


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B 187 


5761 


514 


2300 


4086


5872


784CIP2B 188 


5762 


515 


2301 


4087


5873


784CIP2B 189 


5767 


516 


2302


4088


5874


784CIP2B_190 


5773 


517 


2303 


4089


5875 


784CIP2B 191 


5783 


518 


2304 


4090


5876 


784CIP2B 192 


5784 


519 


2305


4091 


5877 


784CIP2B 193 


5788 


520 


2306


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093


5879 


784CIP2B 196 


5807 


522 


2308 


4094


5880 


784CIP2B_197 


5818 


523 


2309 


4095


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882 


784CIP2B_199 


5827 


525 


2311 


4097


5883 


784CIP2B_200 


5828 


526 


2312 


4098 


5884 


784CIP2B_201


5842 


527 


2313 


4099 


5885


784CIP2B 202 


5853 


528 


2314 


4100 


5886 


784CIP2B_203


5861 


529 


2315 


4101


5887


784CIP2B_204


5864 


530 


2316


4102


5888


784CIP2B_205


5865 


531 


2317 


4103


5889


784CIP2B_206


5871 


532 


2318 


4104 


5890


784CIP2B_207


5873


533 


2319 


4105 


5891 


784CIP2B_208


5873 


534 


2320 


4106 


5892


784CIP2B_209


5875 


535 


2321 


4107 


5893 


784CIP2B_210


5878 


536 


2322 


4108 


5894 


784CIP2B_211


5879


537


2323 


4109 


5895


784CIP2B_212


5880


538 


2324 


4110 


5896 


784CIP2B_213


5880


539


2325 


4111 


5897


784CIP2B_214


5880


540 


2326 


4112 


5898 


784CIP2B_215


5880


541 


2327 


4113 


5899


784CIP2B_216


5885 


542 


2328 


4114 


5900


784CIP2B_217


5895 


543 


2329 


4115 


5901 


784CIP2B_218


5898 


544 


2330 


4116 


5902


784CIP2B_219


5902


545 


2331 


4117 


5903


784CIP2B_220


5904


546 


2332 


4118 


5904 


784CIP2B 221 


5918 


547 


2333 


4119 


5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B_224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225 


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124


5910 


784CIP2B 227 


5946 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B_229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967




556 


2342 


4128 


5914 


784CIP2B_232


5975 


557 


2343 


4129


5915 


784CIP2B_233 


5977 


558 


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B_235 


5979 


560 


2346 


4132 


5918 


784CIP2B_236 


5980 


561 


2347 


4133 


5919 


784CIP2B_237 


5988 


562 


2348


4134 


5920 


784CIP2B_238


5989 


563 


2349 


4135 


5921 


784CIP2B_239 


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997 


565 


2351 


4137 


5923 


784CIP2B_241 


5998 


566 


2352 


4138 


5924 


784CIP2B_242


6003 


567 


2353 


4139 


5925 


784CIP2B_243 


6004 


568 


2354 


4140 


5926 


784CIP2B_244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245 


6028 


570 


2356 


4142 


5928 


784CIP2B_246


6028 


571 


2357 


4143 


5929 


784CIP2B_247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B_249 


6031 


574 


2360 


4146 


5932 


784CIP2B_250 


6032 


575 


2361 


4147 


5933 


784CIP2B_251 


6037 


576 


2362 


4148 


5934 


784CIP2B_252


6037 


577 


2363 


4149 


5935 


784CIP2B_253


6043 


578 


2364 


4150 


5936 


784CIP2B_254 


6044 


579 


2365 


4151 


5937 


784CIP2B_255


6046 


580 


2366 


4152 


5938 


784CIP2B_256


6048 


581 


2367 


4153 


5939 


784CIP2B_257 


6049


582 


2368


4154 


5940 


784CIP2B 258 


6051 


583 


2369 


4155 


5941 


784CIP2B_259 


6053 


584


2370 


4156 


5942 


784CIP2B_260 


6060 


585 


2371 


4157 


5943 


784CIP2B_261 


6063 


586 


2372 


4158 


5944 


784CIP2B_262 


6066 


587 


2373 


4159 


5945 


784CIP2B_263


6067 


588 


2374 


4160 


5946 


784CIP2B_264 


6068 


589 


2375 


4161 


5947 


784CIP2B_265 


6073 


590 


2376 


4162 


5948 


784CIP2B_266 


6076 


591 


2377 


4163 


5949


784CIP2B_267


6076 


592 


2378 


4164 


5950 


784CIP2B_268


6077 


593 


2379 


4165 


5951 


784CIP2B_269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270


6082 


595 


2381 


4167 


5953 


784CIP2B_272


6088 


596


2382 


4168 


5954 


784CIP2B_273 


6091 


597 


2383


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B_275 


6101 


599 


2385 


4171 


5957 


784CIP2B 276 


6103 


600 


2386 


4172 


5958 


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108


602 


2388 


4174 


5960 


784CIP2B_279 


6112 


603 


2389 


4175 


5961 


784CIP2B_280


6121 


604 


2390 


4176 


5962 


784CIP2B 281 


6125 


605


2391 


4177 


5963 


784CIP2B_282 


6126 


606 


2392 


4178 


5964 


784CIP2B_283 


6128 


607 


2393 


4179 


5965


784CIP2B_284 


6129 


608 


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967 


784CIP2B_286 


6133 


610 


2396 


4182 


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B 288 


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B_291 


6146 


615 


2401 


4187 


5973 


784CIP2B_292


6148 


616 


2402 


4188 


5974 


784CIP2B_293 


6149 


617 


2403


4189 


5975 


784CIP2B 294 


6149


618 


2404 


4190 




5976


784CIP2B 295 


6153 


619 


2405 


4191


5977


784CIP2B 296 


6159 


620 


2406 


4192


5978


784CIP2B 297 


6164 


621 


2407 


4193


5979


784CIP2B_298


6167 


622 


2408 


4194


5980


784CIP2B_299


6172 


623 


2409 


4195 


5981 


784CIP2B 300 


6173 


624 


2410 


4196


5982


784CIP2B_301


6190 


625 


2411 


4197 


5983


784CIP2B 302 


6194 


626 


2412 


4198


5984


784CIP2B_303


6196 


627 


2413 


4199 


5985 


784CIP2B_304 


6197 


628 


2414 


4200


5986


784CIP2B_305


6198 


629 


2415 


4201 


5987 


784CIP2B 306 


6198 


630 


2416


4202 


5988 


784CIP2B_308


6214 


631 


2417 


4203 


5989 


784CIP2B_309


6215 


632 


2418


4204 


5990 


784CIP2B 310 


6219 


633 


2419 


4205 


5991 


784CIP2B 311 


6226 


634 


2420 


4206


5992 


784CIP2B 312 


6229 


635 


2421 


4207


5993 


784CIP2B 313 


6234 


636 


2422


4208 


5994 


784CIP2B 314 


6237 


637 


2423


4209 


5995 


784CIP2B 315 


6238 


638 


2424


4210 


5996 


784CIP2B_316 


6239 


639 


2425


4211 


5997 


784CIP2B_317 


6239 


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427


4213 


5999 


784CIP2B 319 


6240 


642 


2428


4214 


6000 


784CIP2B 320 


6244 


643 


2429


4215 


6001 


784CIP2B 321 


6245 


644 


2430


4216 


6002 


784CIP2B_322 


6250


645


2431


4217


6003 


784CIP2B 323 


6252 


646


2432


4218


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B_325 


6256 


648 


2434


4220 


6006 


784CIP2B_326 


6260 


649


2435


4221


6007 


784CIP2B_327 


6261 


650 


2436


4222 


6008 


784CIP2B 328 


6264 


651 


2437


4223


6009 


784CIP2B_329 


6265 


652


2438


4224


6010 


784CIP2B_330


6266 


653 


2439 


4225 


6011 


784CIP2B_331 


6270 


654 


2440


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274


656 


2442 


4228


6014 


784CIP2B 335 


6276 


657 


2443 


4229 


6015 


784CIP2B 336 


6281 


658 


2444


4230


6016


784CIP2B 337 


6281 


659 


2445


4231


6017


784CIP2B 338 


6288 


660 


2446


4232


6018


784CIP2B 339 


6292 


661 


2447 


4233 


6019 


784CIP2B 340 


6294 


662 


2448 


4234 


6020


784CIP2B_343 


6312 


663 


2449 


4235 


6021


784CIP2B 344 


6312 


664 


2450 


4236 


6022 


784CIP2B 345 


6312 


665 


2451 


4237 


6023 


784CIP2B 346 


6322 


666 


2452 


4238


6024


784CIP2B 347 


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329 


668 


2454 


4240 


6026 


784CIP2B 350 


6331 


669 


2455 


4241


6027 


784CIP2B 351 


6333 


670 


2456 


4242 


6028 


784CIP2B 352 


6334 


671 


2457 


4243 


6029 


784CIP2B 353 


6337 


672 


2458 


4244 


6030 


784CIP2B_354


6339 


673 


2459 


4245 


6031 


784CIP2B 355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B 357 


6348 


676 


2462 


4248 


6034 


784CIP2B_358


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B_360


6355 


679 


2465 


4251 


6037 


784CIP2B_361


6362




680 


2466 


4252


6038


784CIP2B_362


6368 


681 


2467 


4253


6039


784CIP2B_363


6369 


682 


2468


4254


6040


784CIP2B_364


6371 


683 


2469 


4255


6041


784CIP2B_365


6376 


684 


2470 


4256


6042


784CIP2B_366


6379


685 


2471 


4257 


6043 


784CIP2B_367


6380


686 


2472


4258


6044


784CIP2B_368


6381


687


2473


4259


6045


784CIP2B_369


6392


688


2474


4260 


6046 


784CIP2B_370


6395 


689


2475


4261


6047 


784CIP2B_371


6397 


690


2476


4262


6048


784CIP2B_372 


6400 


691


2477


4263 


6049 


784CIP2B_373 


6401 


692 


2478


4264 


6050 


784CIP2B_374


6411 


693


2479 


4265 


6051 


784CIP2B_375 


6411 


694 


2480 


4266 


6052 


784CIP2B_376 


6411 


695 


2481 


4267 


6053 


784CIP2B_377 


6416 


696 


2482


4268 


6054 


784CIP2B_378 


6418 


697 


2483


4269 


6055 


784CIP2B_379 


6422 


698


2484


4270


6056


784CIP2B_380


6423 


699 


2485


4271 


6057 


784CIP2B_381 


6426 


700


2486 


4272 


6058 


784CIP2B_382 


6427 


701 


2487


4273 


6059 


784CIP2B_383 


6428 


702 


2488


4274 


6060 


784CIP2B_384 


6429 


703 


2489 


4275 


6061 


784CIP2B_385 


6430 


704 


2490 


4276 


6062 


784CIP2B_386


6432 


705 


2491 


4277 


6063 


784CIP2B_387 


6432 


706 


2492 


4278 


6064


784CIP2B 388 


6438 


707 


2493 


4279 


6065 


784CIP2B_389


6441 


708 


2494 


4280 


6066 


784CIP2B_390 


6446 


709 


2495 


4281 


6067 


784CIP2B_391 


6454 


710 


2496


4282 


6068 


784CIP2B_392


6459 


711 


2497 


4283 


6069 


784CIP2B_394 


6461 


712 


2498 


4284 


6070 


784CIP2B_395


6467 


713 


2499 


4285 


6071 


784CIP2B 396 


6468 


714 


2500 


4286 


6072 


784CIP2B_397 


6487


715 


2501 


4287 


6073 


784CIP2B 398 


6491 


716 


2502 


4288 


6074 


784CIP2B_399 


6506


717 


2503 


4289 


6075 


784CIP2B_401 


6514 


718 


2504


4290 


6076 


784CIP2B_402 


6519 


719


2505 


4291 


6077 


784CIP2B_403 


6521 


720 


2506 


4292 


6078 


784CIP2B_404


6532 


721


2507 


4293 


6079 


784CIP2B_405 


6536 


722 


2508 


4294 


6080 


784CIP2B_406


6543 


723


2509


4295


6081 


784CIP2B_407


6544 


724 


2510 


4296 


6082 


784CIP2B_408


6548


725 


2511


4297 


6083 


784CIP2B_409


6551 


726


2512


4298


6084 


784CIP2B 410 


6551 


727 


2513 


4299 


6085 


784CIP2B_411


6552 


728


2514


4300


6086 


784CIP2B_412 


6554 


729 


2515


4301 


6087 


784CIP2B_413 


6556 


730 


2516


4302 


6088 


784CIP2B_414


6560


731


2517


4303 


6089 


784CIP2B_415 


6563 


732 


2518 


4304 


6090


784CIP2B_416


6564


733 


2519 


4305 


6091 


784CIP2B_417 


6567 


734 


2520 


4306 


6092 


784CIP2B 418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419


6575 


736 


2522 


4308 


6094 


784CIP2B 420 


6577 


737 


2523 


4309 


6095 


784CIP2B 421 


6593 


738 


2524 


4310


6096 


784CIP2B 422 


6595 


739 


2525 


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313 


6099 


784CIP2B 425 


6625


742 


2528 


4314 


6100 


784CIP2B_426


6626 


743 


2529 


4315 


6101 


784CIP2B_427


6630 


744 


2530 


4316 


6102 


784CIP2B_428


6631 


745 


2531 


4317 


6103 


784CIP2B_429


6632


746


2532 


4318 


6104 


784CIP2B_430


6633 


747 


2533 


4319 


6105 


784CIP2B_431


6634 


748 


2534 


4320 


6106 


784CIP2B_432


6638


749 


2535 


4321 


6107


784CIP2B_433


6641 


750 


2536 


4322


6108


784CIP2B_434


6644 


751 


2537


4323


6109


784CIP2B 435 


6646 


752 


2538 


4324


6110


784CIP2B_436 


6648 


753 


2539


4325


6111


784CIP2B 437 


6652 


754 


2540


4326


6112


784CIP2B_438


6654 


755 


2541 


4327


6113 


784CIP2B 439 


6657 


756 


2542 


4328


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441


6663 


758 


2544 


4330 


6116 


784CIP2B 442 


6664 


759


2545


4331 


6117 


784CIP2B_443


6668 


760


2546


4332


6118 


784CIP2B 444 


6669 


761 


2547


4333 


6119 


784CIP2B_445


6673 


762


2548


4334


6120 


784CIP2B 446 


6685 


763


2549


4335


6121 


784CIP2B 447 


6687 


764 


2550


4336 


6122 


784CIP2B 448 


6689 


765 


2551


4337


6123 


784CIP2B 449 


6693 


766 


2552 


4338


6124 


784CIP2B 450 


6698 


767 


2553


4339 


6125 


784CIP2B 451 


6699 


768 


2554


4340 


6126 


784CIP2B 452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B_454


6713 


771 


2557 


4343 


6129 


784CIP2B 455 


6716 


772


2558


4344


6130 


784CIP2B 456 


6725 


773 


2559


4345 


6131 


784CIP2B_457 


6726 


774


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133 


784CIP2B 459 


6730 


776 


2562


4348 


6134 


784CIP2B 460 


6730 


777 


2563


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350


6136 


784CIP2B 462 


6732 


779 


2565


4351


6137 


784CIP2B 463 


6733 


780 


2566 


4352


6138


784CIP2B_464


6737 


781 


2567 


4353


6139


784CIP2B_465


6745


782


2568 


4354


6140


784CIP2B_466


6751 


783 


2569 


4355


6141


784CIP2B 467 


6754 


784 


2570 


4356


6142


784CIP2B_468


6758 


785 


2571 


4357


6143


784CIP2B 469 


6761 


786 


2572 


4358


6144 


784CIP2B 470 


6765 


787 


2573 


4359


6145


784CIP2B_471


6768 


788 


2574 


4360


6146


784CIP2B_472


6773 


789 


2575 


4361


6147


784CIP2B_473


6776 


790 


2576


4362


6148


784CIP2B_474


6796 


791 


2577 


4363


6149


784CIP2B_475


6798 


792 


2578


4364


6150


784CIP2B 476 


6823 


793 


2579 


4365


6151


784CIP2B 477 


6825 


794 


2580 


4366 


6152 


784CIP2B 478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479 


6839 


796 


2582 


4368 


6154 


784CIP2B 480 


6844 


797 


2583 


4369 


6155 


784CIP2B 482 


6849 


798 


2584 


4370 


6156 


784CIP2B 483 


6854 


799 


2585 


4371 


6157 


784CIP2B 484 


6857 


800 


2586 


4372


6158 


784CIP2B 485 


6861 


801 


2587 


4373 


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875


803 


2589


4375


6161 


784CIP2B_488 


6877




804 


2590 


4376 


6162 


784CIP2B_489 


6880 


805 


2591 


4377 


6163 


784CIP2B_490


6885 


806 


2592 


4378 


6164 


784CIP2B_491 


6890 


807 


2593 


4379 


6165 


784CIP2B 492 


6890 


808 


2594 


4380 


6166 


784CIP2B_493


6894 


809


2595 


4381 


6167 


784CIP2B_494


6901 


810 


2596


4382 


6168 


784CIP2B_495


6904 


811 


2597 


4383 


6169 


784CIP2B_496


6907 


812 


2598 


4384 


6170 


784CIP2B_497


6914 


813 


2599 


4385 


6171 


784CIP2B_498 


6917 


814 


2600 


4386 


6172 


784CIP2B_499 


6923 


815 


2601 


4387 


6173 


784CIP2B_500


6929 


816 


2602 


4388 


6174 


784CIP2B 501 


6931 


817 


2603 


4389 


6175 


784CIP2B_502 


6935 


818 


2604 


4390


6176


784CIP2B_503


6940 


819 


2605 


4391 


6177


784CIP2B_504 


6945 


820 


2606 


4392 


6178 


784CIP2B_505 


6946 


821


2607 


4393 


6179 


784CIP2B_506


6947 


822 


2608 


4394


6180 


784CIP2B_507 


6949


823 


2609 


4395 


6181 


784CIP2B_508 


6959 


824


2610 


4396


6182 


784CIP2B_509 


6960 


825 


2611 


4397 


6183 


784CIP2B_510 


6962 


826 


2612 


4398 


6184 


784CIP2B_511 


6963 


827 


2613 


4399 


6185 


784CIP2B_512 


6967 


828 


2614 


4400 


6186 


784CIP2B_513 


6983 


829 


2615 


4401 


6187


784CIP2B_514 


6988 


830 


2616 


4402 


6188


784CIP2B_515 


6996 


831 


2617 


4403 


6189


784CIP2B_516 


7003 


832 


2618 


4404 


6190 


784CIP2B_517 


7016 


833 


2619 


4405 


6191 


784CIP2B_518 


7017 


834 


2620 


4406 


6192 


784CIP2B 519 


7025 


835 


2621 


4407 


6193 


784CIP2B__520 


7025 


836 


2622 


4408 


6194 


784CIP2B_521 


7025 


837 


2623 


4409 


6195 


784CIP2B 522 


7050 


838 


2624 


4410 


6196 


784CIP2B_523 


7051 


839 


2625 


4411 


6197 


784CIP2B__524 


7055 


840 


2626 


4412 


6198 


784CIP2B_525 


7060 


841 


2627 


4413 


6199 


784CIP2B_526 


7064 


842 


2628 


4414 


6200


784CIP2B_527 


7067 


843 


2629 


4415 


6201 


784CIP2B_528


7071 


844 


2630 


4416 


6202 


784CIP2B_529 


7072 


845 


2631 


4417 


6203 


784CIP2B 530 


7073 


846 


2632 


4418 


6204 


784CIP2B_531 


7076 


847 


2633 


4419 


6205 


784CIP2B_532 


7088 


848


2634 


4420 


6206 


784CIP2B_533 


7089 


849 


2635 


4421 


6207 


784CIP2B_534 


7091 


850 


2636 


4422 


6208 


784CIP2B_535 


7091 


851 


2637 


4423 


6209 


784CIP2B_536 


7104 


852 


2638 


4424 


6210 


784CIP2B_537 


7105 


853 


2639 


4425 


6211 


784CIP2B_538 


710S 


854 


2640 


4426 


6212 


784CIP2B_539 


7109 


855 


2641 


4427 


6213 


784CIP2B_540


7109 


856 


2642 


4428 


6214 


784CIP2B_541 


7119 


857 


2643 


4429 


6215 


784CIP2B_542 


7120 


858 


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431 


6217 


784CIP2B_544 


7126 


860 


2646 


4432 


6218 


784CIP2B_545 


7127 


861 


2647 


4433 


6219 


784CIP2B_546 


7130 


862 


2648 


4434 


6220 


784CIP2B_547 


7131 


863 


2649 


4435 


6221 


784CIP2B_548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651 


4437 


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B 551 


7175 


867 


2653 


4439 


6225 


784CIP2B 552 


7188 


868 


2654 


4440 


6226 


784CIP2B 553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554


7190 


870 


2656 


4442


6228 


784CIP2B 555 


7191 


871 


2657 


4443 


6229 


784CIP2B_556 


7203 


872 


2658 


4444 


6230 


784CIP2B_557 


7204 


873


2659 


4445 


6231 


784CIP2B 558 


7208 


874 


2660 


4446 


6232 


784CIP2B_559 


7209 


875 


2661 


4447


6233 


784CIP2B_560


7210 


876 


2662 


4448 


6234 


784CIP2B_561 


7216 


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B 563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238 


784CIP2B_565 


7240 


881 


2667 


4453 


6239 


784CIP2B 566 


7245


882 


2668 


4454 


6240 


784CIP2B_567


7250 


883 


2669 


4455 


6241 


784CIP2B_568 


7251 


884 


2670 


4456 


6242 


784CIP2B 569 


7255 


885 


2671 


4457 


6243 


784CIP2B 570 


7260 


886 


2672 


4458 


6244 


784CIP2B 571 


7265 


887 


2673 


4459 


6245 


784CIP2B_572 


7268 


888 


2674 


4460 


6246


784CIP2B 573 


7275 


889 


2675 


4461 


6247 


784CIP2B 574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283 


891 


2677 


4463 


6249 


784CIP2B_576 


7283 


892 


2678 


4464 


6250 


784CIP2B 577 


7287 


893 


2679 


4465


6251 


784CIP2B 578 


7301 


894 


2680 


4466 


6252 


784CIP2B_579 


7308 


895


2681 


4467 


6253 


784CIP2B 580 


7308 


896 


2682 


4468 


6254 


784CIP2B_581 


7309 


897 


2683 


4469 


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B_583


7320


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


901 


2687 


4473 


6259 


784CIP2B 586 


7334 


902 


2688 


4474 


6260 


784CIP2B_587 


7337


903 


2689 


4475 


6261 


784CIP2B 588 


7339


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691


4477


6263


784CIP2B_590 


7355 


906 


2692 


4478 


6264 


784CIP2B_591


7363 


907 


2693 


4479 


6265 


784CIP2B 592 


7363 


908 


2694


4480 


6266 


784CIP2B 593 


7365 


909 


2695 


4481 


6267 


784CIP2B 594 


7368 


910


2696 


4482 


6268 


784CIP2B 595 


7369 


911 


2697 


4483 


6269 


784CIP2B_596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375


913


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B 601 


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B 603 


7391 


917 


2703 


4489 


6275 


784CIP2B_604 


7393


918 


2704


4490


6276 


784CIP2B 605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492


6278


784CIP2B_607


7399 


921 


2707 


4493 


6279 


784CIP2B 608 


7405 


922 


2708 


4494 


6280


784CIP2B 609 


7406 


923 


2709 


4495 


6281 


784CIP2B_610 


7406 


924 


2710 


4496 


6282 


784CIP2B 611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B 613 


7411 


927 


2713 


4499 


6285 


784CIP2B 614 


7417


928


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715 


4501 


6287 


784CIP2B_616


7421 


930 


2716 


4502 


6288 


784CIP2B__617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618


7422 


932 


2718 


4504 


6290 


784CIP2B_619


7423 


933 


2719 


4505


6291


784CIP2B_620


7424 


934 


2720


4506 


6292 


784CIP2B_621


7426 


935 


2721 


4507 


6293 


784CIP2B_622


7427 


936 


2722 


4508 


6294 


784CIP2B_623


7428


937


2723 


4509 


6295 


784CIP2B_624


7430


938 


2724 


4510 


6296 


784CIP2B_625


7435 


939 


2725 


4511 


6297 


784CIP2B_626 


7437


940 


2726 


4512 


6298 


784CIP2B_627 


7439 


941 


2727 


4513 


6299 


784CIP2B_628 


7440 


942


2728 


4514 


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B 630 


7450 


944 


2730 


4516 


6302 


784CIP2B_631


7451 


945 


2731 


4517 


6303 


784CIP2B_632 


7452 


946 


2732 


4518 


6304 


784CIP2B_633


7454 


947 


2733 


4519 


6305 


784CIP2B_634


7457 


948 


2734 


4520 


6306 


784CIP2B_635 


7459 


949 


2735 


4521 


6307 


784CIP2B_636 


7461 


950 


2736 


4522 


6308 


784CIP2B_637


7463 


951 


2737 


4523 


6309 


784CIP2B_638


7466 


952 


2738 


4524


6310 


784CIP2B_639 


7469 


953 


2739 


4525 


6311 


784CIP2B_640


7473


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955 


2741 


4527 


6313 


784CIP2B_642 


7482 


956 


2742 


4528 


6314 


784CIP2B_643


7482 


957 


2743 


4529 


6315 


784CIP2B_644 


7483 


958 


2744 


4530 


6316 


784CIP2B_645 


7485 


959 


2745 


4531 


6317 


784CIP2B_646 


7486 


960 


2746 


4532 


6318 


784CIP2B_647 


7487 


961 


2747 


4533 


6319 


784CIP2B_648


7491 


962 


2748 


4534 


6320 


784CIP2B_649


7492 


963 


2749 


4535


6321 


784CIP2B_650 


7494 


964 


2750 


4536 


6322 


784CIP2B_651


7498 


965 


2751 


4537 


6323 


784CIP2B_652


7504 


966 


2752 


4538


6324


784CIP2B_653


7508 


967 


2753 


4539 


6325 


784CIP2B_654 


7516 


968


2754 


4540 


6326 


784CIP2B_655 


7518 


969 


2755 


4541 


6327 


784CIP2B_656 


7519 


970 


2756 


4542 


6328 


784CIP2B_657 


7521 


971 


2757 


4543 


6329 


784CIP2B_658


7529 


972 


2758 


4544 


6330 


784CIP2B_659 


7532 


973 


2759 


4545 


6331 


784CIP2B_660


7533 


974 


2760 


4546 


6332 


784CIP2B_661


7535 


975 


2761 


4547 


6333 


784CIP2B_662


7545 


976 


2762 


4548 


6334 


784CIP2B_663 


7546 


977 


2763 


4549 


6335 


784CIP2B_664


7552 


978 


2764 


4550 


6336 


784CIP2B_665 


7554 


979 


2765 


4551 


6337 


784CIP2B_666


7567 


980 


2766 


4552 


6338 


784CIP2B_667


7569


981 


2767 


4553 


6339 


784CIP2B_668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669


7576 


983 


2769 


4555 


6341 


784CIP2B_670


7577 


984 


2770 


4556 


6342 


784CIP2B_671 


7579 


985 


2771 


4557 


6343 


784CIP2B_672


7582 


986 


2772


4558 


6344 


784CIP2B_673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560 


6346 


784CIP2B 675 


7597 


989 


2775 


4561 


6347 


784CIP2B 676 


7597


990 


2776


4562


6348


784CIP2B_677 


7609 


991 


2777 


4563 


6349


784CIP2B_678


7609 


992 


2778 


4564 


6350


784CIP2B_679


7609


993 


2779


4565 


6351 


784CIP2B 680 


7613 


994 


2780


4566 


6352 


784CIP2B_681


7623 


995


2781 


4567 


6353 


784CIP2B_682


7629 


996 


2782


4568


6354


784CIP2B 683 


7630 


997 


2783 


4569


6355


784CIP2B_684


7633


998 


2784 


4570 


6356 


784CIP2B 685 


7635 


999 


2785 


4571


6357 


784CIP2B 686 


7638 


1000 


2786 


4572 


6358 


784CIP2B 687 


7639 


1001 


2787 


4573 


6359 


784CIP2B 688 


7646 


1002 


2788


4574


6360 


784CIP2B 689 


7647 


1003 


2789


4575 


6361 


784CIP2B_690 


7648 


1004 


2790


4576 


6362 


784CIP2B_691 


7658 


1005 


2791


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578 


6364 


784CIP2B 693 


7664 


1007 


2793


4579 


6365 


784CIP2B 695 


7674


1008


2794


4580 


6366 


784CIP2B_696 


7675


1009


2795


4581 


6367 


784CIP2B 697 


7676 


1010 


2796 


4582 


6368 


784CIP2B 698 


7681 


1011 


2797 


4583 


6369 


784CIP2B_699 


7688 


1012 


2798 


4584 


6370 


784CIP2B 700 


7693 


1013 


2799


4585 


6371 


784CIP2B 701 


7694


1014


2800


4586


6372 


784CIP2B_702


7715 


1015 


2801


4587 


6373 


784CIP2B 703 


7716 


1016 


2802


4588 


6374 


784CIP2B 704 


7718 


1017


2803


4589


6375 


784CIP2B_705 


7721 


1018


2804


4590 


6376 


784CIP2B_706 


7723 


1019


2805


4591 


6377 


784CIP2B 707 


7729 


1020 


2806


4592 


6378 


784CIP2B 708 


7733 


1021 


2807 


4593 


6379 


784CIP2B 709 


7735 


1022


2808 


4594 


6380 


784CIP2B 710 


7741 


1023 


2809


4595 


6381 


784CIP2B_711 


7743 


1024 


2810 


4596 


6382 


784CIP2B 712 


7748 


1025


2811 


4597 


6383 


784CIP2B_713 


7749 


1026 


2812


4598


6384


784CIP2B 714 


7750 


1027


2813 


4599 


6385 


784CIP2B_715 


7757 


1028


2814


4600


6386 


784CIP2B_716 


7759 


1029 


2815 


4601


6387 


784CIP2B_717 


7760 


1030 


2816 


4602 


6388 


784CIP2B 718 


7760 


1031 


2817 


4603


6389 


784CIP2B 719 


7764 


1032 


2818 


4604 


6390 


784CIP2B 720 


7765 


1033 


2819


4605


6391 


784CIP2B 721 


7766 


1034 


2820 


4606


6392 


784CIP2B 722 


7767 


1035 


2821


4607


6393


784CIP2B 723 


7769 


1036 


2822 


4608


6394 


784CIP2B 724 


7770 


1037 


2823


4609


6395 


784CIP2B 725 


7774 


1038 


2824 


4610


6396 


784CIP2B_726 


7779 


1039


2825


4611


6397


784CIP2B 727 


7781 


1040 


2826 


4612 


6398 


784CIP2B 728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783 


1042 


2828 


4614 


6400 


784CIP2B_730


7787 


1043 


2829 


4615 


6401 


784CIP2B_731


7792 


1044 


2830 


4616 


6402


784CIP2B 732 


7795 


1045 


2831 


4617 


6403 


784CIP2B_733


7801 


1046 


2832 


4618 


6404 


784CIP2B_734 


7807 


1047 


2833 


4619 


6405 


784CIP2B_735


7808 


1048 


2834 


4620 


6406 


784CIP2B 736 


7819 


1049 


2835 


4621 


6407


784CIP2B 737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051 


2837 


4623 


6409 


784CIP2B_739


7829


1052 


2838


4624


6410


784CIP2B_740


7832 


1053 


2839 


4625


6411 


784CIP2B_741 


7839 


1054 


2840 


4626


6412 


784CIP2B 743 


7847 


1055 


2841 


4627 


6413 


784CIP2B_744 


7848 


1056 


2842


4628 


6414 


784CIP2B 745 


7853 


1057


2843


4629


6415 


784CIP2B 746 


7854


1058


2844


4630


6416 


784CIP2B_747


7856


1059


2845


4631


6417 


784CIP2B_748


7862 


1060 


2846 


4632 


6418 


784CIP2B 749 


7865 


1061 


2847 


4633 


6419 


784CIP2B 750 


7874 


1062 


2848 


4634 


6420 


784CIP2B_751 


7877 


1063 


2849 


4635 


6421 


784CIP2B 752 


7880 


1064 


2850 


4636 


6422 


784CIP2B 753 


7882 


1065 


2851 


4637 


6423 


784CIP2B_754


7884 


1066 


2852 


4638 


6424 


784CIP2B_755 


7886 


1067 


2853 


4639 


6425 


784CIP2B_756 


7888 


1068 


2854 


4640 


6426 


784CIP2B_757 


7889 


1069 


2855 


4641 


6427 


784CIP2B_758 


7901 


1070 


2856 


4642 


6428 


784CIP2B_759 


7910 


1071 


2857 


4643 


6429 


784CIP2B_760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761 


7921 


1073 


2859 


4645 


6431 


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B_763


7924 


1075 


2861 


4647 


6433 


784CIP2B_764 


7925 


1076 


2862 


4648 


6434 


784CIP2B_765 


7928 


1077 


2863 


4649 


6435 


784CIP2B_766 


7929 


1078 


2864 


4650 


6436 


784CIP2B_767 


7930 


1079 


2865 


4651 


6437 


784CIP2B 768 


7934 


1080 


2866


4652 


6438 


784CIP2B_769 


7938 


1081 


2867 


4653 


6439 


784CIP2B 770 


7942 


1082


2868 


4654 


6440 


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B_772 


7946 


1084


2870 


4656 


6442 


784CIP2B_773 


7948


1085 


2871 


4657 


6443 


784CIP2B_774 


7951 


1086 


2872 


4658 


6444 


784CIP2B 775 


7952 


1087 


2873 


4659 


6445 


784CIP2B 776 


7953


1088 


2874 


4660 


6446 


784CIP2B_777


7954 


1089 


2875 


4661 


6447 


784CIP2B_778


7957 


1090 


2876 


4662 


6448 


784CIP2B_779 


7958 


1091 


2877 


4663 


6449


784CIP2B 780 


7961 


1092 


2878 


4664 


6450 


784CIP2B_781


7965 


1093 


2879 


4665 


6451 


784CIP2B_782


7966 


1094 


2880 


4666 


6452 


784CIP2B_783 


7979 


1095 


2881 


4667 


6453 


784CIP2B_784


7986 


1096 


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B_786 


7988 


1098 


2884 


4670 


6456 


784CIP2B_787 


7991


1099


2885 


4671 


6457 


784CIP2B_788


7992 


1100 


2886 


4672 


6458 


784CIP2B_789 


7992 


1101


2887 


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460


784CIP2B 791 


7992 


1103


2889 


4675 


6461 


784CIP2B 792 


8003 


1104 


2890


4676


6462


784CIP2B_793


8014 


1105 


2891 


4677 


6463 


784CIP2B_794 


8015


1106 


2892 


4678 


6464 


784CIP2B 795 


8016 


1107 


2893 


4679 


6465 


784CIP2B_796 


8017 


1108 


2894


4680 


6466 


784CIP2B 797 


8019 


1109 


2895 


4681 


6467 


784CIP2B_798 


8020 


1110 


2896 


4682 


6468


784CIP2B 799 


8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B_801 


8028 


1113 


2899


4685 


6471 


784CIP2B_802 


8030


1114 


2900 


4686 


6472


784CIP2B 803 


8038 


1115 


2901 


4687 


6473


784CIP2B 804 


8042 


1116 


2902 


4688


6474


784CIP2B 805 


8045 


1117 


2903 


4689


6475


784CIP2B 806 


8045 


1118 


2904 


4690 


6476


784CIP2B 807 


8046 


1119 


2905 


4691


6477


784CIP2B 808 


8047 


1120 


2906 


4692 


6478


784CIP2B_809 


8051 


1121 


2907 


4693


6479


784CIP2B 810 


8059 


1122 


2908 


4694 


6480


784CIP2B 811 


8064 


1123 


2909 


4695 


6481


784CIP2B 812 


8069 


1124 


2910


4696


6482 


784CIP2B_813


8074 


1125 


2911 


4697


6483


784CIP2B_814 


8077 


1126 


2912 


4698


6484 


784CIP2B 815 


8078


1127 


2913 


4699


6485 


784CIP2B_816 


8079 


1128 


2914 


4700 


6486 


784CIP2B 817 


8084


1129 


2915 


4701 


6487 


784CIP2B 818 


8088


1130 


2916 


4702


6488


784CIP2B_819


8090


1131 


2917 


4703 


6489 


784CIP2B 820 


8091 


1132 


2918 


4704 


6490 


784CIP2B 821 


8099


1133 


2919 


4705 


6491 


784CIP2B 822 


8099 


1134


2920


4706 


6492 


784CIP2B 823 


8100 


1135


2921


4707 


6493 


784CIP2B 824 


8102 


1136 


2922


4708 


6494 


784CIP2B_825 


8103 


1137 


2923


4709 


6495 


784CIP2B 826 


8103 


1138


2924


4710 


6496 


784CIP2B 827 


8104 


1139


2925


4711 


6497 


784CIP2B 828 


8108 


1140 


2926


4712 


6498 


784CIP2B_829 


8110 


1141


2927 


4713 


6499 


784CIP2B 830 


8116 


1142


2928


4714 


6500 


784CIP2B 831 


8117 


1143


2929


4715


6501


784CIP2B 832 


8123 


1144 


2930


4716


6502


784CIP2B 833 


8130 


1145


2931


4717


6503 


784CIP2B_834 


8130 


1146


2932


4718


6504


784CIP2B 835 


8143


1147 


2933 


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720


6506 


784CIP2B 837 


8154 


1149 


2935 


4721


6507 


784CIP2B_838 


8155 


1150 


2936


4722 


6508 


784CIP2B 839 


8162 


1151


2937


4723 


6509 


784CIP2B_840 


8163 


1152 


2938


4724


6510 


784CIP2B 841 


8172 


1153 


2939 


4725


6511 


784CIP2B 842 


8173 


1154 


2940


4726 


6512 


784CIP2B 843 


8179 


1155 


2941 


4727


6513 


784CIP2B 844 


8182


1156 


2942 


4728


6514 


784CIP2B 845 


8183 


1157


2943


4729


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 


1159 


2945 


4731


6517 


784CIP2B 848 


8187 


1160 


2946


4732


6518 


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B_850 


8190 


1162 


2948 


4734


6520 


784CIP2B 851 


8190 


1163 


2949 


4735


6521


784CIP2B 852 


8192 


1164 


2950 


4736


6522


784CIP2B_853


8193 


1165 


2951 


4737


6523


784CIP2B 854 


8197 


1166 


2952 


4738 


6524 


784CIP2B 855 


8197 


1167 


2953 


4739 


6525 


784CIP2B 856 


8199 


1168 


2954 


4740 


6526 


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B 858 


8203 


1170 


2956 


4742


6528


784CIP2B_859


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211


1173 


2959 


4745 


6531 


784CIP2B 862 


8214


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 


1175 


2961 


4747 


6533 


784CIP2B 864 


8223


1176 


2962


4748 


6534 


784CIP2B_865 


8224 


1177 


2963 


4749 


6535 


784CIP2B_866


8226


1178 


2964 


4750 


6536 


784CIP2B_867 


8227 


1179 


2965 


4751 


6537 


784CIP2B_868


8229 


1180 


2966 


4752 


6538 


784CIP2B_869 


8232 


1181 


2967 


4753 


6539 


784CIP2B_870 


8236 


1182 


2968 


4754 


6540 


784CIP2B_871 


8239 


1183


2969 


4755 


6541 


784CIP2B_872 


8244 


1184 


2970 


4756 


6542 


784CIP2B_873


8245 


1185 


2971 


4757 


6543 


784CIP2B_874 


8248 


1186 


2972 


4758 


6544 


784CIP2B 875 


8251 


1187 


2973 


4759 


6545 


784CIP2B 876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877 


8260 


1189 


2975 


4761 


6547 


784CIP2B 878 


8262 


1190 


2976 


4762 


6548 


784CIP2B_879 


8268 


1191 


2977 


4763 


6549 


784CIP2B_880


8270 


1192 


2978 


4764 


6550 


784CIP2B_881


8272 


1193 


2979 


4765 


6551 


784CIP2B 882 


8274 


1194 


2980 


4766 


6552 


784CIP2B_883 


8274 


1195 


2981 


4767 


6553 


784CIP2B 884 


8275


1196 


2982 


4768 


6554 


784CIP2B_885 


8277 


1197 


2983 


4769 


6555 


784CIP2B 886 


8281 


1198 


2984 


4770 


6556 


784CIP2B_887 


8283 


1199 


2985 


4771 


6557 


784CIP2B_888


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987 


4773 


6559 


784CIP2B 890 


8300 


1202 


2988 


4774 


6560 


784CIP2B 891 


8303 


1203 


2989 


4775 


6561 


784CIP2B 892 


8304 


1204 


2990 


4776 


6562 


784CIP2B 893 


8305 


1205 


2991 


4777 


6563 


784CIP2B 894 


8309 


1206 


2992 


4778


6564 


784CIP2B 895 


8318 


1207 


2993 


4779 


6565 


784CIP2B 896 


8319 


1208 


2994 


4780 


6566 


784CIP2B 897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898


8322 


1210 


2996 


4782 


6568 


784CIP2B 899 


8323 


1211 


2997 


4783 


6569 


784CIP2B_900 


8325 


1212 


2998 


4784 


6570 


784CIP2B_901


8331 


1213 


2999 


4785 


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903 


8333 


1215 


3001 


4787 


6573 


784CIP2B_904 


8335 


1216 


3002 


4788 


6574 


784CIP2B_905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906 


8337 


1218 


3004 


4790 


6576 


784CIP2B_907 


8340 


1219 


3005 


4791 


6577 


784CIP2B_908 


8343 


1220 


3006 


4792 


6578 


784CIP2B_909 


8347 


1221 


3007 


4793 


6579


784CIP2B_910 


8349 


1222 


3008 


4794 


6580 


784CIP2B_911 


8351 


1223 


3009 


4795 


6581 


784CIP2B_912 


8353 


1224 


3010 


4796 


6582 


784CIP2B_913 


8355 


1225 


3011 


4797 


6583 


784CIP2B_914 


8361 


1226 


3012 


4798 


6584 


784CIP2B_915 


8365 


1227 


3013 


4799 


6585 


784CIP2B_916 


8367 


1228 


3014 


4800 


6586 


784CIP2B_917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921 


8391


1232


3018 


4804 


6590 


784CIP2B_922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B 924 


8394


1235 


3021 


4807 


6593 


784CIP2B_925 


8395 


1236 


3022 


4808 


6594 


784CIP2B_926 


8396 


1237 


3023


4809 


6595 


784CIP2B_927 


8398


1238 


3024 


4810 


6596 


784CIP2B 928 


8402 


1239


3025 


4811 


6597


784CIP2B_929


8402


1240 


3026 


4812


6598 


784CIP2B 930 


8405 


1241 


3027 


4813 


6599


784CIP2B 931 


8406 


1242 


3028 


4814 


6600


784CIP2B 932 


8409 


1243 


3029 


4815


6601


784CIP2B 933 


8410 


1244 


3030 


4816


6602


784CIP2B 934 


8414


1245


3031


4817


6603


784CIP2B_935


8415 


1246 


3032 


4818 


6604


784CIP2B 936 


8419 


1247 


3033 


4819


6605 


784CIP2B 937 


8426


1248


3034 


4820 


6606


784CIP2B_938


8430 


1249 


3035


4821


6607 


784CIP2B 939 


8431 


1250 


3036


4822


6608 


784CIP2B 940 


8432 


1251 


3037


4823


6609 


784CIP2B 941 


8433 


1252 


3038


4824


6610 


784CIP2B 942 


8434 


1253 


3039


4825


6611 


784CIP2B 943 


8438 


1254 


3040


4826


6612 


784CIP2B 944 


8439 


1255


3041


4827


6613 


784CIP2B 945 


8441 


1256 


3042


4828


6614 


784CIP2B 946 


8450 


1257 


3043


4829


6615


784CIP2B_947


8451 


1258 


3044


4830 


6616 


784CIP2B_948


8452 


1259


3045


4831


6617 


784CIP2B 949 


8460 


1260


3046


4832


6618 


784CIP2B_950 


8461 


1261


3047


4833


6619 


784CIP2B 951 


8462 


1262


3048 


4834 


6620 


784CIP2B_952 


8464


1263


3049


4835


6621 


784CIP2B 953 


8465 


1264 


3050


4836


6622 


784CIP2B_954 


8467 


1265 


3051 


4837


6623 


784CIP2B 955 


8470 


1266


3052


4838


6624 


784CIP2B_956 


8471 


1267 


3053


4839 


6625 


784CIP2B 957 


8473 


1268 


3054 


4840


6626 


784CIP2B_958 


8474 


1269


3055


4841


6627 


784CIP2B_959 


8475 


1270 


3056 


4842


6628 


784CIP2B 960 


8476 


1271 


3057 


4843 


6629 


784CIP2B_961 


8480 


1272 


3058 


4844


6630


784CIP2B_962


8482 


1273 


3059 


4845


6631 


784CIP2B_963 


8482 


1274 


3060


4846


6632 


784CIP2B_964 


8486 


1275 


3061 


4847


6633 


784CIP2B 965 


8488 


1276 


3062 


4848


6634 


784CIP2B 966 


8492 


1277 


3063 


4849


6635


784CIP2B_967


8494 


1278 


3064 


4850


6636 


784CIP2B_968 


8496 


1279 


3065 


4851


6637 


784CIP2B 969 


8497 


1280


3066 


4852 


6638


784CIP2B 970 


8499 


1281 


3067 


4853 


6639


784CIP2B_971 


8513 


1282


3068


4854


6640 


784CIP2B_972 


8522 


1283 


3069 


4855 


6641


784CIP2B 973 


8526 


1284 


3070 


4856


6642


784CIP2B_974


8531 


1285 


3071 


4857 


6643


784CIP2B_975 


8533 


1286 


3072 


4858 


6644


784CIP2B 976 


8542 


1287 


3073 


4859


6645 


784CIP2B 977 


8544 


1288 


3074 


4860 


6646


784CIP2B 978 


8565 


1289 


3075 


4 861 


6647 


784CIP2B 979 


8565 


1290 


3076 


4862 


6648 


784CIP2B 980 


8572


1291


3077 


4863 


6649 


784CIP2B_981


8576 


1292 


3078 


4864 


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


8598 


1295 


3081 


4867 


6653 


784CIP2B 985 


8602 


1296 


3082 


4868 


6654 


784CIP2B_986 


8604 


1297 


3083 


4869 


6655 


784CIP2B_987 


8609 


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871 


6657 


784CIP2B 989 


8637


1300 


3086 


4872 


6658 


784CIP2B_990


8640 


1301 


3087 


4873


6659 


784CIP2B_991 


8643 


1302 


3088 


4874


6660 


784CIP2B_992 


8645 


1303 


3089 


4875 


6661 


784CIP2B_993


8650 


1304


3090


4876


6662 


784CIP2B_994 


8651 


1305 


3091 


4877 


6663 


784CIP2B_995


8654 


1306 


3092 


4878 


6664 


784CIP2B_996


8655 


1307 


3093 


4879 


6665 


784CIP2B_997


8657 


1308 


3094 


4880 


6666 


784CIP2B_998


8665 


1309 


3095 


4881 


6667 


784CIP2B_999 


8668 


1310 


3096 


4882 


6668 


784CIP2B_1000


8671 


1311 


3097 


4883 


6669 


784CIP2B_1001 


8672 


1312 


3098 


4884


6670 


784CIP2B_1002 


8692 


1313 


3099 


4885 


6671 


784CIP2B_1003


8706


1314 


3100 


4886 


6672 


784CIP2B_1004


8716 


1315 


3101 


4887 


6673 


784CIP2B_1005


8719 


1316 


3102 


4888 


6674 


784CIP2B_1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B_1007 


8764 


1318 


3104 


4890 


6676 


784CIP2B_1008


8764 


1319 


3105 


4891 


6677


784CIP2B_1009 


8764 


1320 


3106 


4892 


6678 


784CIP2B_1010 


8774 


1321 


3107 


4893 


6679 


784CIP2B_1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B_1012


8796 


1323 


3109 


4895 


6681 


784CIP2B_1013 


8827 


1324 


3110 


4896 


6682 


784CIP2B_1014 


8842 


1325 


3111 


4897 


6683 


784CIP2B_1015


8842 


1326 


3112 


4898 


6684 


784CIP2B_1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B_1017 


8871 


1328 


3114 


4900 


6686 


784CIP2B_1018 


8921 


1329 


3115 


4901 


6687 


784CIP2B_1019


8927 


1330 


3116 


4902 


6688 


784CIP2B_1020


8942 


1331


3117 


4903 


6689 


784CIP2B_1021 


8994 


1332 


3118 


4904 


6690


784CIP2B 1022 


9023 


1333


3119 


4905 


6691 


784CIP2B_1023


9028 


1334 


3120 


4906 


6692 


784CIP2B_1024 


9058 


1335 


3121 


4907 


6693 


784CIP2B_1025


9058 


1336 


3122 


4908 


6694 


784CIP2B_1026 


9079 


1337 


3123 


4909 


6695 


784CIP2B_1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B_1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B_1029 


9084 


1340 


3126 


4912 


6698 


784CIP2B_1030


9093 


1341 


3127 


4913 


6699 


784CIP2B_1031


9101 


1342 


3128 


4914 


6700 


784CIP2B_1032


9103 


1343 


3129 


4915 


6701 


784CIP2B_1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B_1034 


9151 


1345 


3131 


4917 


6703 


784CIP2B_1035


9161 


1346 


3132 


4918 


6704 


784CIP2B 1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B_1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B 1038 


9204 


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B_1040 


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239


1352


3138


4924 


6710 


784CIP2B_1042 


9256 


1353 


3139 


4925 


6711 


784CIP2B_1043 


9276 


1354


3140 


4926 


6712 


784CIP2B_1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B 1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B 1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B 1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B_1049 


9500 


1360 


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B_1051 


9520



WO 01/53312 



PCT/US00/34263 



1362 


3148 


4934 


6720 


784CIP2B_1052


9541 


1363 


3149 


4935 


6721 






1364 


3150 


4936 


6722 


784CIP2B_1054


9548


1365 


3151 


4937 


6723 


784CIP2B 1055 




1366


3152 


4938 


6724 


784CIP2B_1056


9556 


1367 


3153 


4939 


6725 


784CIP2B_1057


9575 


1368 


3154 


4940 


6726 


784CIP2B 1058 


9589 


1369 


3155 


4941 


6727 


784CIP2B_1059


9599 


1370 


3156 


4942 


6728 


784CIP2B_1060


9602 


1371 


3157 


4943 


6729 


784CIP2B_1061


9606 


1372 


3158 


4944 


6730


784CIP2B_1062


9622


1373 


3159 


4945 


6731 


784CIP2B_1063


9623 


1374 


3160 


4946 


6732 


784CIP2B_1064


9646 


1375 


3161 


4947 


6733


784CIP2B_1065


9747


1376 


3162


4948 


6734


784CIP2B_1066


9773


1377 


3163 


4949 


6735


784CIP2B_1067


9785


1378 


3164


4950 


6736 


784CIP2B_1068


9801 


1379 


3165 


4951


6737


784CIP2B_1069


9811 


1380 


3166 


4952 


6738


784CIP2B_1070 


9843 


1381 


3167 


4953


6739


784CIP2B 1071 


9854 


1382 


3168 


4954


6740


784CIP2B 1072 


9854 


1383 


3169 


4955 


6741 


784CIP2B 1073 


9864 


1384 


3170 


4956


6742


784CIP2B_1074


9864


1385 


3171 


4957


6743 


784CIP2B_1075


9871 


1386 


3172 


4958 


6744 


784CIP2B 1076 


9879 


1387 


3173 


4959


6745


784CIP2B_1077 


9881 


1388 


3174 


4960 


6746


784CIP2B 1078 


9885 


1389 


3175


4961


6747


784CIP2B 1079 


9901 


1390 


3176


4962


6748


784CIP2B 1080 


9912 


1391 


3177 


4963


6749


784CIP2B_1081


9916 


1392 


3178 


4964


6750


784CIP2B 1082 


9921 


1393 


3179 


4965 


6751 


784CIP2B_1083


9925 


1394 


3180 


4966 


6752 


784CIP2B 1084 


9930 


1395 


3181


4967


6753 


784CIP2B_1085


9949 


1396 


3182 


4968


6754 


784CIP2B_1086


9951 


1397 


3183 


4969 


6755 


784CIP2B_1087


9959


1398 


3184 


4970 


6756


784CIP2B 1088 


9973 


1399 


3185 


4971


6757


784CIP2B_1089


9982 


1400 


3186 


4972 


6758 


784CIP2B_1090


9994 


1401


3187 


4973 


6759


784CIP2B_1091


10021


1402 


3188 


4974 


6760


784CIP2B_1092


10041 


1403 


3189 


4975 


6761


784CIP2B_1093


10067


1404 


3190 


4976 


6762


784CIP2B_1094


10073


1405 


3191 


4977 


6763 


784CIP2B_1095


10112


1406 


3192 


4978 


6764 




10117 


1407 


3193 


4979 


6765 


784CTP2R inqfl 


10132 


1408 


3194 


4980 


6766 


784CIP2B_1099


1 (11 CQ 


1409 


3195 


4981 


6767 


784CIP2B_1100


10217 


1410 


3196 


4982 


6768 


784CIP2B_1101


JLUZZb 


1411 


3197 


4983 


6769 


784CIP2B_1102


10232 


1412 


3198 


4984 


6770 


784CIP2B_1103


10237 


1413 


3199 


4985 


6771


784CIP2B_1104


10279 


1414 


3200 


4986 


6772 


784CIP2C 1 


33


1415 


3201 


4987 


6773 


784CIP2C 2 


271 


1416 


3202 


4988 


6774 


784CIP2C_3 


848 


1417


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C 5 


864 


1419 


3205


4991 


6777 


784CIP2C 6 


953 


1420 


3206 


4992


6778 


784CIP2C 7 


980 


1421 


3207 


4993 


6779 


784CIP2C 8 


1595 


1422 


3208 


4994 


6780 


784CIP2C_9


1697 


1423


3209


4995 


6781 


784CIP2C_10 


1744


1424


3210 


4996 


6782 


784CIP2C 11 


1937 


1425


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998 


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C_17 


2902 


1431 


3217 


5003 


6789 


784CIP2C_18 


2905 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C_20 


2956 


1434 


3220 


5006 


6792 


784CIP2C_21 


2959 


1435


3221 


5007 


6793 


784CIP2C 22 


2965 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C 24 


2970 


1438 


3224 


5010 


6796 


784CIP2C_25 


2985 


1439 


3225 


5011 


6797 


784CIP2C 26 


2987 


1440 


3226 


5012 


6798 


784CIP2C_27 


2993 


1441 


3227 


5013 


6799 


784CIP2C 28 


2993 


1442 


3228 


5014 


6800 


784CIP2C_29 


3017 


1443 


3229 


5015 


6801 


784CIP2C_30 


3046 


1444 


3230 


5016 


6802 


784CIP2C_31 


3050 


1445 


3231 


5017 


6803 


784CIP2C_32 


3357 


1446 


3232 


5018 


6804 


784CIP2C_33


3359 


1447 


3233 


5019 


6805 


784CIP2C_34 


3432 


1448 


3234 


5020 


6806 


784CIP2C_35


3438 


1449 


3235 


5021 


6807 


784CIP2C_36 


3439 


1450 


3236 


5022 


6808 


784CIP2C 39 


3463 


1451 


3237 


5023 


6809 


784CIP2C_40 


3466 


1452 


3238 


5024 


6810


784CIP2C_41 


3466 


1453 


3239 


5025 


6811


784CIP2C_42 


3467 


1454 


3240 


5026 


6812


784CIP2C_43 


3468 


1455 


3241 


5027 


6813 


784CIP2C_44 


3483 


1456 


3242 


5028


6814


784CIP2C_45 


3484 


1457 


3243 


5029 


6815 


784CIP2C_46 


3486 


1458 


3244 


5030 


6816 


784CIP2C_47 


3491 


1459 


3245 


5031 


6817 


784CIP2C_48 


3493 


1460 


3246 


5032 


6818 


784CIP2C_49


3494 


1461 


3247 


5033 


6819 


784CIP2C 50 


3495


1462 


3248 


5034 


6820 


784CIP2C_51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503 


1464 


3250 


5036 


6822 


784CIP2C_53 


3503 


1465 


3251 


5037


6823 


784CIP2C_54 


3504


1466 


3252 


5038 


6824 


784CIP2C_55 


3511 


1467 


3253 


5039 


6825 


784CIP2C_56 


3531 


1468 


3254 


5040 


6826 


784CIP2C_57 


3536 


1469 


3255 


5041 


6827 


784CIP2C_58


3546 


1470 


3256 


5042 


6828 


784CIP2C_59


3548


1471 


3257 


5043 


6829 


784CIP2C_60


3551 


1472 


3258 


5044 


6830 


784CIP2C_61 


3553 


1473 


3259 


5045 


6831


784CIP2C_62 


3564 


1474 


3260 


5046 


6832 


784CIP2C 63 


3567 


1475 


3261 


5047 


6833


784CIP2C_64


3572 


1476 


3262 


5048 


6834 


784CIP2C_65


3573 


1477 


3263 


5049 


6835 


784CIP2C_66 


3574 


1478 


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051 


6837 


784CIP2C_68 


3615 


1480 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629


1482 


3268 


5054 


6840 


784CIP2C_71


3666 


1483 


3269 


5055 


6841 


784CIP2C_72 


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C_74


3912


1486 


3272 


5058 


6844 


784CIP2C 75 


3924 


1487 


3273 


5059 


6845 


784CIP2C 76 


3928 


1488 


3274 


5060 


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C 78 


3959 


1490 


3276 


5062 


6848 


784CIP2C 79 


3981 


1491 


3277 


5063 


6849 


784CIP2C_80 


3989 


1492 


3278 


5064 


6850 


784CIP2C 81 


4295 


1493 


3279 


5065 


6851 


784CIP2C_82 


4300 


1494 


3280 


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067 


6853 


784CIP2C_84


4362 


1496 


3282 


5068 


6854 


784CIP2C 85 


4371 


1497


3283 


5069 


6855 


784CIP2C 86 


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C 89 


4378 


1500 


3286 


5072 


6858 


784CIP2C 90 


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502


3288 


5074


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C_93 


4421 


1504 


3290 


5076 


6862 


784CIP2C_94 


4426 


1505 


3291 


5077 


6863 


784CIP2C_95 


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865


784CIP2C_97


4436 


1508 


3294 


5080 


6866 


784CIP2C 98 


4439 


1509 


3295 


5081 


6867 


784CIP2C_99 


4440 


1510 


3296 


5082 


6868 


784CIP2C 100 


4441 


1511 


3297 


5083


6869 


784CIP2C_101 


4442 


1512 


3298 


5084 


6870 


784CIP2C_102 


4455 


1513


3299 


5085 


6871


784CIP2C_103


4462 


1514 


3300 


5086 


6872 


784CIP2C 104 


4466 


1515 


3301 


5087 


6873 


784CIP2C_105 


4469 


1516 


3302 


5088 


6874


784CIP2C 106 


4477 


1517 


3303 


5089 


6875 


784CIP2C 107 


4481 


1518


3304 


5090 


6876


784CIP2C_108 


4483 


1519 


3305 


5091 


6877 


784CIP2C_109 


4484 


1520 


3306 


5092 


6878 


784CIP2C_110 


4486 


1521 


3307 


5093 


6879 


784CIP2C 111 


4490 


1522 


3308 


5094 


6880 


784CIP2C_112 


4499 


1523 


3309 


5095 


6881 


784CIP2C_113


4503 


1524 


3310 


5096 


6882 


784CIP2C_114 


4506 


1525 


3311 


5097 


6883 


784CIP2C 115 


4509 


1526 


3312 


5098 


6884 


784CIP2C_116


4514 


1527 


3313 


5099 


6885 


784CIP2C_117 


4516 


1528 


3314 


5100 


6886 


784CIP2C_118 


4522 


1529 


3315 


5101 


6887 


784CIP2C_119


4525 


1530 


3316 


5102 


6888 


784CIP2C_120 


4527 


1531 


3317 


5103 


6889 


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C 122 


4529 


1533 


3319 


5105 


6891 


784CIP2C_123 


4532 


1534 


3320 


5106 


6892 


784CIP2C_124 


4537 


1535 


3321 


5107 


6893 


784CIP2C_125 


4538 


1536 


3322 


5108 


6894 


784CIP2C_126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127


4552 


1538


3324 


5110 


6896 


784CIP2C 128 


4559 


1539


3325


5111 


6897 


784CIP2C 129 


4567 


1540 


3326 


5112 


6898 


784CIP2C 130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132


4585 


1542 


3328 


5114 


6900 


784CIP2C 133 


4592 


1543 


3329 


5115 


6901 


784CIP2C 134 


4609 


1544 


3330 


5116 


6902 


784CIP2C 135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618 


1547 


3333 


5119 


6905 


784CIP2C 138 


4620


1548 


3334 


5120 


6906 


784CIP2C_139 


4624 


1549 


3335 


5121 


6907 


784CIP2C_140 


4632 


1550 


3336 


5122 


6908 


784CIP2C 141 


4634 


1551 


3337 


5123 


6909 


784CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C_143 


4639 


1553 


3339 


5125 


6911 


784CIP2C 144 


4643 


1554 


3340 


5126 


6912 


784CIP2C_145 


4644 


1555 


3341 


5127 


6913 


784CIP2C 146 


4655 


1556


3342 


5128 


6914 


784CIP2C_147 


4668 


1557 


3343 


5129 


6915 


784CIP2C 148 


4677 


1558 


3344 


5130 


6916 


784CIP2C_149 


4677 


1559


3345 


5131


6917 


784CIP2C_150 


4677 


1560 


3346 


5132 


6918 


784CIP2C_152


4682 


1561 


3347 


5133 


6919 


784CIP2C_153


4690 


1562 


3348 


5134 


6920 


784CIP2C 154 


4691 


1563 


3349 


5135 


6921 


784CIP2C_155


4727 


1564 


3350 


5136 


6922 


784CIP2C_156


4730 


1565 


3351 


5137 


6923 


784CIP2C_157 


4734 


1566 


3352 


5138 


6924 


784CIP2C 158 


4757 


1567 


3353 


5139 


6925 


784CIP2C_159 


4764 


1568 


3354 


5140 


6926 


784CIP2C_160 


4786 


1569 


3355 


5141 


6927 


784CIP2C_161


4793 


1570 


3356 


5142 


6928 


784CIP2C_162


4825 


1571 


3357 


5143 


6929 


784CIP2C_163


4826 


1572 


3358 


5144 


6930 


784CIP2C_164 


4850 


1573 


3359 


5145 


6931 


784CIP2C_165


4853 


1574 


3360 


5146 


6932 


784CIP2C_166


4855


1575 


3361 


5147 


6933 


784CIP2C_167


4856 


1576


3362


5148 


6934 


784CIP2C_168 


4867 


1577 


3363 


5149 


6935 


784CIP2C_169


4869 


1578 


3364 


5150 


6936 


784CIP2C_170 


4878


1579


3365 


5151 


6937 


784CIP2C_171


4880 


1580 


3366 


5152 


6938 


784CIP2C_172


4942 


1581 


3367 


5153 


6939 


784CIP2C_173 


4945 


1582 


3368 


5154 


6940 


784CIP2C_174 


4950 


1583 


3369 


5155 


6941 


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C 176 


4954 


1585 


3371 


5157 


6943 


784CIP2C_177


4958 


1586 


3372 


5158 


6944 


784CIP2C 178 


4961 


1587 


3373 


5159 


6945 


784CIP2C_179 


5590 


1588 


3374 


5160 


6946 


784CIP2C 180 


5599 


1589 


3375 


5161 


6947 


784CIP2C_181


5692 


1590 


3376 


5162 


6948 


784CIP2C_182 


5732 


1591 


3377 


5163 


6949 


784CIP2C_183


5765 


1592 


3378 


5164 


6950 


784CIP2C 184 


5771 


1593 


3379 


5165 


6951 


784CIP2C_185 


5774 


1594 


3380 


5166 


6952 


784CIP2C_186 


5793 


1595 


3381 


5167 


6953 


784CIP2C_187


5806 


1596 


3382 


5168 


6954 


784CIP2C 188 


5852 


1597 


3383 


5169 


6955 


784CIP2C 189 


5892


1598 


3384 


5170 


6956


784CIP2C_190 


6057 


1599 


3385 


5171 


6957 


784CIP2C_191


6061 


1600 


3386


5172 


6958


784CIP2C_192 


6109 


1601 


3387 


5173 


6959 


784CIP2C 193 


6160 


1602 


3388 


5174 


6960 


784CIP2C_194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195 


6398 


1604 


3390 


5176 


6962 


784CIP2C_196 


6398 


1605


3391 


5177 


6963 


784CIP2C_197


6415


1606 


3392 


5178 


6964 


784CIP2C 198 


6448 


1607 


3393 


5179 


6965 


784CIP2C_199 


6469 


1608 


3394 


5180


6966 


784CIP2C 200 


6476 


1609 


3395 


5181 


6967 


784CIP2C 201 


6561


1610 


3396 


5182 


6968 


784CIP2C_202 


6574 


1611


3397 


5183 


6969 


784CIP2C 203 


6578 


1612 


3398 


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185 


6971 


784CIP2C 205 


6672 


1614 


3400 


5186 


6972 


784CIP2C_206 


6691 


1615 


3401 


5187 


6973 


784CIP2C 207 


6695 


1616 


3402 


5188 


6974 


784CIP2C_208 


6746 


1617 


3403 


5189 


6975 


784CIP2C 209 


6898 


1618 


3404 


5190 


6976 


784CIP2C_210


6938 


1619 


3405 


5191 


6977 


784CIP2C_211


6943 


1620 


3406 


5192 


6978


784CIP2C 212 


7110 


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980 


784CIP2C_214


7212 


1623 


3409 


5195 


6981 


784CIP2C_215


7218 


1624 


3410 


5196 


6982 


784CIP2C 216 


7249 


1625 


3411 


5197 


6983 


784CIP2C_217 


7500 


1626 


3412 


5198 


6984 


784CIP2C 218 


7509 


1627 


3413 


5199 


6985 


784CIP2C 219 


7523 


1628 


3414 


5200 


6986 


784CIP2C_220


7544 


1629 


3415 


5201 


6987 


784CIP2C 221 


7564 


1630 


3416 


5202 


6988 


784CIP2C_222


7568


1631 


3417 


5203 


6989 


784CIP2C_223 


7631 


1632 


3418 


5204 


6990 


784CIP2C_224 


7813


1633 


3419 


5205 


6991 


784CIP2C 225 


7831 


1634 


3420 


5206 


6992 


784CIP2C_226 


7843 


1635 


3421 


5207 


6993 


784CIP2C 227 


7907 


1636 


3422 


5208


6994 


784CIP2C 228 


7943 


1637 


3423


5209 


6995 


784CIP2C_229 


8175 


1638 


3424 


5210 


6996 


784CIP2C 230 


8216 


1639 


3425 


5211 


6997 


784CIP2C 231 


8225 


1640 


3426 


5212 


6998 


784CIP2C 232 


8271 


1641 


3427 


5213 


6999 


784CIP2C 233 


8397 


1642 


3428 


5214 


7000 


784CIP2C_234 


8466 


1643


3429 


5215 


7001 


784CIP2C_235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953 


1645 


3431 


5217 


7003 


784CIP2C_237 


9106 


1646 


3432 


5218 


7004 


784CIP2C_238


9139 


1647 


3433 


5219 


7005 


784CIP2C 239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240 


9650 


1649 


3435 


5221 


7007 


784CIP2C_241 


9889 


1650 


3436 


5222 


7008 


784CIP2C_242 


9933 


1651 


3437 


5223 


7009 


784CIP2C 243 


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D 1 


746 


1654 


3440 


5226 


7012 


784CIP2D 2 


3558 


1655 


3441 


5227 


7013 


784CIP2D_3 


3553 


1656


3442 


5228 


7014 


784CIP2D 4 


3633 


1657 


3443 


5229 


7015 


784CIP2D 5 


3653 


1658 


3444 


5230 


7016 


784CIP2D_6 


3732 


1659 


3445 


5231 


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018


784CIP2D 8 


4700 


1661 


3447


5233 


7019 


784CIP2D 9 


4703 


1662 


3448 


5234


7020 


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D_11


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13 


5159 


1666 


3452 


5238 


7024 


784CIP2D_14


7443 


1667 


3453 


5239 


7025 


784CIP2D 15 


8673 


1668 


3454 


5240 


7026 


784CIP2D_16


8679 


1669 


3455


5241 


7027 


784CIP2D 17 


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756


1672 


3458


5244 


7030 


784CIP2D_20 


8818


1673 


3459 


5245 


7031 


784CIP2D_21 


8844 


1674 


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247 


7033 


784CIP2D_23 


8912 


1676 


3462 


5248 


7034 


784CIP2D_24 


8918 


1677 


3463 


5249 


7035 


784CIP2D_25 


8918 


1678 


3464 


5250 


7036 


784CIP2D_26 


8941 


1679 


3465 


5251 


7037 


784CIP2D_27 


8941 


1680 


3466 


5252 


7038 


784CIP2D_28


8951 


1681 


3467 


5253 


7039 


784CIP2D_29 


8951 


1682 


3468 


5254 


7040 


784CIP2D_30


9007 


1683 


3469 


5255 


7041 


784CIP2D_31 


9012 


1684 


3470 


5256 


7042 


784CIP2D_32


9013 


1685


3471 


5257 


7043 


784CIP2D_33 


9025 


1686 


3472 


5258 


7044 


784CIP2D_34 


9053 


1687 


3473 


5259 


7045 


784CIP2D_35


9054 


1688 


3474 


5260 


7046 


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D_37


9113 


1690 


3476 


5262 


7048 


784CIP2D_38


9134 


1691 


3477 


5263 


7049 


784CIP2D_39


9152 


1692 


3478 


5264 


7050 


784CIP2D_40 


9152 


1693 


3479 


5265 


7051 


784CIP2D_41 


9211 


1694 


3480 


5266 


7052 


784CIP2D_42 


9223 


1695


3481 


5267 


7053 


784CIP2D_43


9223 


1696 


3482


5268 


7054 


784CIP2D_44 


9231 


1697 


3483 


5269 


7055 


784CIP2D_45 


9236 


1698 


3484


5270 


7056 


784CIP2D_46 


9236 


1699 


3485 


5271 


7057 


784CIP2D_47 


9303 


1700 


3486 


5272 


7058 


784CIP2D_48 


9309


1701 


3487 


5273 


7059 


784CIP2D_49


9314 


1702 


3488 


5274 


7060 


784CIP2D_50 


9326 


1703 


3489 


5275 


7061 


784CIP2D_51 


9339 


1704 


3490 


5276 


7062 


784CIP2D_52 


9348 


1705


3491 


5277 


7063 


784CIP2D_53 


9376 


1706 


3492 


5278 


7064 


784CIP2D_54 


9382 


1707 


3493


5279 


7065 


784CIP2D_55 


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495 


5281 


7067 


784CIP2D_57


9439 


1710 


3496 


5282 


7068 


784CIP2D_58 


9485 


1711 


3497 


5283 


7069 


784CIP2D_59 


9493


1712 


3498 


5284 


7070 


784CIP2D_60 


9501 


1713 


3499 


5285 


7071 


784CIP2D_61 


9526 


1714 


3500 


5286 


7072 


784CIP2D_62 


9526 


1715 


3501 


5287 


7073 


784CIP2D_63 


9551 


1716 


3502 


5288 


7074 


784CIP2D_64


9557 


1717 


3503 


5289 


7075 


784CIP2D_65 


9568 


1718 


3504 


5290 


7076 


784CIP2D_66 


9588 


1719 


3505 


5291 


7077 


784CIP2D_67


9597 


1720 


3506 


5292 


7078 


784CIP2D_68 


9615 


1721 


3507 


5293 


7079 


784CIP2D_69 


9628 


1722 


3508 


5294 


7080 


784CIP2D_70 


9649 


1723 


3509 


5295


7081 


784CIP2D_71 


9652 


1724 


3510 


5296 


7082 


784CIP2D_72 


9660 


1725 


3511 


5297 


7083 


784CIP2D_73 


9662 


1726 


3512 


5298 


7084 


784CIP2D_74 


9725 


1727 


3513 


5299 


7085 


784CIP2D_75


9746 


1728 


3514 


5300 


7086 


784CIP2D_76 


9777 


1729 


3515 


5301 


7087 


784CIP2D_77 


9787 


1730 


3516 


5302


7088 


784CIP2D_78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79 


9842 


1732 


3518 


5304 


7090 


784CIP2D_80 


9842 


1733 


3519 


5305 


7091 


784CIP2D_81 


9848


1734 


3520 


5306 


7092 


784CIP2D_82




1735 


3521 


5307 


7093 


784CIP2D_83


10020


1736 


3522


5308 


7094 


784CIP2D_84


1001 1 


1737 


3523 


5309 


7095 


784CIP2D_85


10052


1738 


3524 


5310 


7096 


784CIP2D_86


f 


1739 


3525 


5311 


7097 


784CIP2D_87


10085 


1740 


3526 


5312 


7098 


784CIP2D_88


10139 


1741 


3527 


5313 


7099 






1742 


3528 


5314 


7100 




10165


1743 


3529 


5315 


7101 




10173 


1744 


3530 


5316 


7102 




10173 


1745 


3531 


5317 


7103 


*7 ft 4 fT D O n Qtr 


10273 


1746 


3532 


5318 


7104 




3121 


1747 


3533 


5319 


7105 




3628 


1748 


3534 


5320 


7106 


784CIP2E_4


3673 


1749 


3535 


5321 


7107 


784CIP2E_5


4018 


1750 


3536


5322 


7108 




4467 


1751 


3537 


5323 


7109 


784CIP2E_7


4865


1752 


3538 


5324 


7110


784CIP2E 8 


4916 


1753 


3539 


5325 


7111


784CIP2E 9 


4923 


1754 


3540 


5326 


7112 


784CIP2E 10 


4926 


1755 


3541 


5327 


7113 


784CIP2E 11 


4962 


1756 


3542 


5328 


7114 


784CIP2E 12 


4963 


1757 


3543


5329


7115


784CIP2E 13 


4964 


1758 


3544 


5330


7116 


784CIP2E_14 


4988 


1759 


3545 


5331 


7117 


784CIP2E 15 


5835 


1760 


3546


5332


7118


784CIP2E_16 


7682 


1761 


3547 


5333 


7119 


784CIP2E_17


7682 


1762 


3548


5334 


7120 


784CIP2E 18 


7699 


1763 


3549 


5335 


7121 


784CIP2E 19 


7707 


1764 


3550 


5336 


7122 


784CIP2E 20 


7707 


1765 


3551 


5337 


7123 


784CIP2E 21 


7752 


1766 


3552 


5338 


7124 


784CIP2E 22 


8357 


1767 


3553 


5339 


7125 


784CIP2E 23 


9065 


1768 


3554


5340


7126


784CIP2E_24


9324 


1769 


3555 


5341 


7127 


784CIP2F_1


2976 


1770 


3556 


5342


7128


784CIP2F_2


3559 


1771 


3557 


5343 


7129 


784CIP2F_3


4021 


1772 


3558 


5344 


7130




4474 


1773 


3559 


5345 


7131 




4566 


1774 


3560 


5346 


7132 




4705 


1775 


3561 


5347 


7133 




4707 


1776


3562 


5348 


7134 




4712 


1777 


3563 


5349 


7135 


784CIP2F 9 


5008 


1778 


3564 


5350 


7136 


784CIP2F 10 


5009 


1779 


3565 


5351 


7137 


784CIP2F 11 


5015 


1780 


3566


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F 13 


7724 


1782 


3568 


5354 


7140 


784CIP2F_14 


7725 


1783 


3569 


5355 


7141 


784CIP2F 15 


8828 


1784 


3570 


5356 


7142 


784CIP2F 16 


8830 


1785 


3571 


5357 


7143 


784CIP2F_17 


9739 


1786 


3572 


5358


7144 


784CIP2F 18 


9896 



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHLSARLSALILDEVAILPAPQNI^VLSTNMKHLLMWSPVIAPG 
ETVYYSVEYQGEYESLYTSHIW I PS SWCSLTEGP3CDVTDDITA 
TV P YNLRVRATLGSQTS /CLEHP /VS IPLIETQ PSLPDL/RMEI 
TKDGFHLVIELEDLGPQFEFLVAYWRREPGAEEHVTCMVRSGGIP 
VHLETMEPGAAYv-VKAvfl * V tvrtXoK i j^.^r ay x c,v v^c-"- 1 -* 
VIJU^FAFVGFMLI LVVVPLFVWKMGRLLQ/ YLLLPRGGS SQTPW 
KITQF 


5360 


2 


1115 


PRVRS SGGQEDP ASQQWARPRFTQPSKMRRRVI ARPVGS S VRLK 
CVASGHPRPD I TWMKDDQALTRPEAAEPRKKKWTIiSLKNIiRPED 
SGKYTCRVSNRAGAI NAT YKVDVI QRTRSKPVLTGTHPVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 
VVLPTGDVWSRPDGSYLiNKIjLITRARQDDAGMY H-L*UftiN 1 wuxij 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS 
TPHTYTHPPP S CQLNSSHS 


5361 


3 


925 


"H^S~ISSANII>I J DDQFQPK1,TDFAMAH?RSHLEHQSCTINMTSS 
SS KELW YMPEE Y I RQGKLS I XTDVYS FG I V IME VLTGCRWLDD 
PKHIQLRDIjLRELMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGRCAATRAKLRPSMDEVLNTLESTQASLYFAEDPPTSLKSFRC 
PSPLFLENVPS I PVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC 
SQSEVMFLSLBKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362 


2 


4879 


scqvegctrt yns sqs ig khmktahpdq yaafkmqrks kkgqxa 
nniiktpnngkfvyflpspvnssnpfftsqtkangnpacsaqlqh 
vspp i fpahlasvstpllssmesv in pni tsqdkn3qggmlcsq 
menlpstalpaqmedltktvlplnidrgsdpflslpaess s idl 

F PS PADS GTNSVFS QLENNTNHYSSQ I EGNTNS S FLKGGNGENA 
VFPSQVNVANKFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKKPAI IRDGKF1CSRCYRAFTNPRSLGGHLSKRS YCKPLDGA 
EIAQELLQSNGQPSLIiASMHiSTNAWLQQPQQSTFNPEACFKD 
P S FLQLLAENRS PAFLPNT FPRSGVTN FNTS VSQEGSE 1 1 1 QALi 
ETAGIPSTFEGAEMLSHVSTGCVSDASQVNATVMPNPTVPPLLH 
TVCHPNTIiLTNQWRTSNS KTSS IEECSSLPVFPTNDLLLKTVEN 
GLCSSS FPNSGG PS QNFTSNS SRVS VI SGPQNTRS SHLNKKGNS 
AS KRRKKVAPPLI APNASQNLVTSDLTTMGLI AKSVE I PTTNLH 
SNVIPTCEPQSLVEIHjTQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDS QMMAI*NSCTTSVNSDLQI SEDNVI QNFEKT 
LEI IKTAMNSQ I I*E VKSGSQGAGETSQNAQINYNI QI»P SVNTVQ 
NNKLPDS S P \FSS F 1 SVMPTESN I PQS E\ VSHKEDQ I QEILEGL 
QKI*KLENDLSTPASQCVLIWTSVTLTPTPVKSTADITVIOPVSS 
MINIQFNDKVNKPFVCQNCK3CNYSAMTI03ALFKHYGKIHQYTPE 
MILE I KKNQLKFAP FKC WPTCTKTFTRNSNLRAHCQLVHH FTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHSNVAVI PEKQL I EKKS PDKTES SLQV I TVTS 
EQCNTNALTNTQTKGRKIRRHKKEKEE KKRKKPVSQSLEFPTR Y 
S PYRP YRCVHQG CFAAFTIQQNLl LHYQAVHKSDL PAFS AEVEE 
ESEAGKESEETETKQTLKEFRCQVSDCSR1FQAITGLIQHYMKL 
HEMTPEEI E S MTAS VDVGKFP CDQLECKS S FTTYLWYWHLEAD 
HGIGLRAS KTEEDG VYKCDCEGCDRI YATRSNLLRH I FNKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALSECTSRFVTQYPCMIKGCTSWTSESNI IRH YKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSiyrCVSESND 
KSRTTATVSQKEVEKNE*DEMDELTEDFITKLINEDSTSVETQA 
NTS SNVSNDFQEDNL CQSERQ KASNLKRVNKEKWVS QNKKRK\TE 
KAE PASAAELSSVRKEEETAVAI QTI EEHP AS FDKS S F KPMGFE 
VS FLKFLEESAVKQKKNTDKDH PNTGNKKGSHSNSRKN IDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VLKQLQEMKPTVSLKKLEVHSNDPDMSVMKDISIGKATGRGQY


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMIiECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMfWDCTCLGEGSGR 
I TCTS RNR CNDQDTRTS YRIGDTWS KKDNRGNLLQC I CTGNGRG 
EVJKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DS G WYS VGMQ LA * KTQGNKQML \ CTCLGNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTMTVIiVQTRGGNSNGALCHFPFLYlWHNYTDCTSEGRR 
DNMKWCGTTQN YDADQKFGFCPMAAHEE I CTTNEGVMYR I GDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTF7IKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQ1 
GDS WE KY VHGVR YQCYC YGRG IGE WHCQ PLOT Y PSS SG P VEVF I 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YTI KGLKPG WYEGC.LIS X QQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFS PLVATSES VTEITAS S FWS WVSASDTV 
SGFRVEYELSEEGDEPQYIjVLPSTATSV\NIP\DLI,PGRKYIVN 
VYQ I S EDGEQS L I LSTSQTTAPDAP PDPTVDQVDDTS I WRWS R 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDIjQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTI MWTP PES AVTG YR VDVI PVN1»PGEHGQRL.PLSRNTF\ AEN 
TGI>S PGVTYY FKV FAVSHGR ES KP1»TAQQTTK1>\ DAPTNXiQF VN 
ETDS T VI»VRW T P PRAQ I TG YRLT VGLTRRGQ PRQYNVG PS VS KY 
PLRNLQPAS E YTVSLVAI KGNQES PKATGVFTTLQPGS S I PPYN 
TEVTETTI VI TWTPAPR I GFKLG VRP3QGGE APREVTSDSGS I V 

vsgltpgve yvytiqvlrdgqerdap \ ivnk\ wtplspptnlh 
leanpdtgvltvswersttpd itgyritttptngqqgksl.eew 
hadqs sct f \ dnle vpgle ynvs vytvkddkes vp i sdt 1 1 pav 
ppptdlrftn / 1 lgpdtmrvtw \ ap p ps i dltnflvrys pvkne 
grmlqsls i fflsdn\awltnllpgteywsvssvyeqhestp 
\lrgrqktgldsp\tgidfs\ditaVnsft\vhw\iapra/tpi 
tgyrir\hhpehf\sgrpredr\vph3rnsitltnltpgteyw 
s ivalngrees pl>i> i gqqstvsdvprdle waatptsll i \swd 
apavtvryyrit ygetggnspvqe ftvpgskstati sglkpgvd 
yt i tvyavtgrgds pas s kpisinyrte i dkpsqmqvtdvqdns 
isvkwlpss sp vtgyrvttta pkkgpg\ptktktagpdqtemti 
eglqptveywsvyaqnpsgesqplvqtavtwidrpkglaftdv 
dvds i kiawes pqgqvsryrvtysspedgihelfpapdgeedta 
elo/3lrpgseywsvvalhddmesqpligtqstaipaptdlkft 
qvtptslsaqwtppnvqltgyrvrvtpkektgpmkeinlapdss 

S VWS GLMVATKYE VS VYALKDTIjTSRPAQG WTTLENVS PPRR 
ARVTDATETTI TIS WRTKTETITG FQVDAVPANGQTPIQRTIKP 
DVRSYTITGLQPGTDYKIYLYTLNDWARSSPWIDASTAIDAPS 
NLRFLxATTPNSLIjVSWQPPRARITGYIIKYEKPGSPPREWPRP 

rpgvteatitglepgteytiyvialknnqksepligrkktdelp 
ql vtl phpnlhg pe ild v pstvq ktp fvthpg ydtgng i qlpgt 
sgqqpsvgqomifeehgfrrttppttatpirhrprpyppnvgqe 

ALSQTTISWAFFQDTSEYIISCHPVGTDEEPIiQFRVPGTSTSAT 

LTGLTRGATYNI I vealkdqqrhkvreewtvgnsvneglnqpt 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCIjGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCI'CFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCP IECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCyGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTI ANRCHEGG QS Y K I GDTWRRPHETGG YMLECV CLGNGKG EWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQC I CTGNGRG 
EWKCERHTS VQTTS SGSGP FTDVRAAVYQPQPHPQPPP YGHCVT 
DSGWYS VGMQLA* KTQGNKQML\CTCLGNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDX3HLWCSTTSNYEQDQ 
KYSFCTDHTVX.VQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQN YDADQ KFG FCPMAAH3E I CTTNEG VM YR IG DQW 
DKQHDMGHMMRCTCVGNGRGE WTC I AYSQLRDQC I VDD I TYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TE T PSQPNSHP I QWN APQP 55 H I S KY I LRW RP KNS VGRW KEAT I P 
GHLNSYTIKGIiKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFS PLVATSES VTEITASS FWSWVSASDTV 
SG FRVE YEL S EEGDE PQYLVLP STATS V\N I P \ DItLPGRKY I VN 
VYQI S EDGEQSLI LSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELN1»PETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDIjQFVEVTDV 
KVT I MWT P PES AVTG YR VD VI PVNLPG EHGQRLPL S RNT F \ AEN 
TGI^PGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFW 
ETDSTVLVRWTPPRAQ I TGYRLTVGLTRRGQPRQYMVGPSVS KY 
PLRNLQPASE YTVSLVAI KGNQES PKATGVFTTLQPGSS I PPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEY VYTIQVIiRDGOERDAP \ I VNK\ VVTPI>SPPTNI*H 
LEAN P DTGVIjTVS HERSTTP D I TG YR ITTTPTNGQQGN SLEE W 
HADQS S CTF\ DNLEVPGLE YNVS VYTVKDDKE S VP I SDTI IPAV 
PPPTDLRFTN / 1 LGPDTMRVTW \AP PP S IDLTN FLVRYS P V KNE 
GRMLQSLS I FF1»S DN \A WLTNLLPGTE YWS VSS VYEQHES TP 
\LRGRQKTGLDSP\TGIDFS \ DITA\ WS FT\ VHW\ IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYVV 
SIVALNGREESPLLIGQQSTVSDVPRDLEVVAATPTSIiLl\SWD 
APAVTVRYYR I TYGETGGNS P VQEFT VPGSKS TATI SGLKPGVD 
YTI TV YAVTGRGDS PASS KP I S IN YRTE I DKPS QMQVTDVQDNS 
I SVKWLPS SS P VTG YRVTTTX P KNGPG \ PTKTKTAG PDQTEMTI 
EGLQPTVE YVVS V YAQNP SGESQPLVQTAVTNI DRP KGLAFTDV 
DVDS I KIAWES PQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLR PGS E YTVS WALHDDMESQPL I GTQSTAI PAPTDLKFT 
QVTPTS LS AQWTP PNVQLTG YRVRVT PKEK.TGPMKE INLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETTITIS WRTKTETI TGFQVDAVPANGQTP IQRTIKP 
DVRS YT ITGLQPGTDYKI YLYTLNDNARS SPWIDAS TAIDAP S 
NLRFIiATTPNS LLVSWQ P PRARITGY 1 1 KYEKPGS PPREW PR P 
RPGVTEATITGLEPGTEYTI YVIALKNNQKSEPLI GR KKTDELP 
QLVTLPHPNLHG PE I LDVPSTVQKTP FVTHPGYDTGNGIQLPGT 
SGO^PSVGOJ2MIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTT I SWAP FQDTSEY IIS CH P VGTDEEPL<2 FRVPGTSTSAT 
LTGLTRGATYN 1 1 VEAIiKDQQRHKVREE VVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
S SRWCHDNGVNYKI GE KWDRQGENGQMMS CTCL/3NGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNWCP I ECFMPLDVQ 
ADREDSRE 


5365 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA * KTQGNKQMI> \ CTCLGNGVS CQE TAVTQT YG 
GMSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSNGAIjCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQN YDADQKFGFCPMAAHEE I CTTNEGVMYR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQIiRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGCX5RGRW KCD P VDQCQDSETGTFYQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

ghlns yti kglxpgwyegqli s i qqyghqe vtrfdftttstst 
pvtsntWtgettpfsplvatsesvteitassfwswvsasdtv 
sgfr ve yelseegdepqylvlpstats v\ni p \ dllpgrkyi vn 
vyqi s edgeqsli lstsqttapdap pdptvdqvddts i wrwsr 
pqap i tgyri vys psvegsstelnlpetans vtl5dlqpgvq yn 
i ti yaveenqes tp wi qqettgt pr sdt vps pr dlq fvevtdv 
kvtimwtppesavtgyrvdvipvnlpgehgqrlplsrntfVaen 

TGLS PGVTY YFKVFAVS HGRES KP LTAQOTTKL\ DAPTNLQFVN 

etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrwlqpaseytvszivai kgxqes pkatgvfttlqpgs sippyn 
tevtettivitwtpaprigfeclgvrpsqggeaprevtsdsgsiv 
vsgltpgveyvytlqvlrdgqerdap\lvnk\wtplspptnt,h 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 
HADQS SCT P \ DN L R V PG LEYMVS VYT VKDDKES VPI SDT 1 1 PA V 
PPPTDLRFTN/ ILGPDTMRVTW\APPPSIDLTNFI>VRYS pvkne 
GRMI.QS LS I F FLS DN\A\TVLTNL>L PGT3 YVVS VS S VYEQHE S TP 
\LRGRQKTGLDSP\TGIDFS\DXTA\NSFT\VHW\IAPRA/TPI 
TGYRI R\HHPEHF \ SGR PREDR\VPHSRNS ITLTNLTPGTEYW 
SIVALNGREESPLIj IGQQSTVSDVPRDLEWAATPTSLL I \SWD 
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 
YTITV YAVTGRGDS PASS KP I SIN YRTEIDKPSQMQVTDVQDNS 
ISVKWL PSSS PVTG YRVTTT\ PKNGPG\ PTKTKTAGPDQTEMT I 
EGLQ PTVE YWS VYAQN P SGE SQPLVQTAVTN I DR P KGLAFTD V 
DVDS I KIAWESPQGQVSRYRVTYSSP EDGIHELFPAPDGEEDTA 
ELQGLR PGSEYT VS WALHDDMESQPI*I GTQSTAr PAPTDLKFT 
QVTPTSLSAQWTP PNVQLTGYRVRVTPKEKTGPMKEINIAPDS S 
S VWSG IiMVAT KYEVS VYALKDTljTS RPAQGVVTTLENVS P PRR 
ARVTDATETTI T I S WRTKTET I TGFQVDAVP ANGQTP IQRT I KP 
DVRS YT ITGLQPGTDYKI YLYTLNDNARSSPWI DASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYI I KYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTT 1 3 WAPFQDTSE YI I $ CHP VGTDEE PLQFR VPGTSTS AT 
LTGLTRGATYNI I VEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSKYAVGDEWERMSESGFKI/IjCQCLGFGSGHFRCD 
S S RWCHDNG VN Y KI GE KWDRQG ENGQMMSCTCLGNGKGE FKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRB 


5366


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQNYT3AIX3KFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHP I QWNAPQPSH ISKYI LRWRPKNSVGRWKE AT I P 
GHLNSYTI KGLKPGWYEGQLIS IQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSA5DTV 
SGFRVEYELSEEGDEPQYI>VLPSTATSV\NIP\DLLPGRKYIVN 
VYQ ISEDGEQSI.ILSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAP I TG YR I V YS PS VEGS STELNLP ETANS VTLSDLQ PG VQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDIjQFVEVTDV 

kvtimwtppesavtgyrvdvipvnlpgehgqrlplsrntfNaen 
tglspgvtyyfkvfavshgreskpx>taqqttkl\daptnlqfvn 

ETDSTVXiVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASE YT VSLVAI KGNQES PKATGVFTTLQPGSS I PPYN 
TEVTETTIVIWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVE YVYTIQVLRDGQERDAP \ I VNK\ WTPLS PPTNIiH 
LEANP DTGVIjTVS WERSTTPD ITG YR I TTTPTNGQQGNSLiEE W 
t IADQS S CTF\ DNLE VPGLEYTTVS VYTVKDDKES VP I SDT 1 1 PAV 
PP PTDIiRFTN / 1 LGPDTMRVTW \ AP PP 5 1 DLTN FL»VR Y S P VKNE 
GRMLQSLS I FFIiSDN\AWLTNI*LPGTEY WSVSS V YEQHESTP 
\LRGRQKTGLDS P\TGIDFS\DITA\NSFT\ VHW \ I APRA/TP I 
TG YR I R\HHPEH F \SGRPREDR\ VPHS RNS I TLTN L.T PGTE YW 
SIVALNGREESPLIiIGQQ STVSDVPRDLEVVAATP TSLL I \ S WD 
APAVTVRYYR I T YGETGGNS P VQEFTVPGS KSTAT ISGLKPG VD 
YTITVYAVTGRGDS PASS KP I S IN YRTE I DKP S QMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYWSVYAQNPSGESQPWQTAVTNIDRPKGIAFTDV 
DVDSIKIAWESPCX3QVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGXiR PGSE YT VS WALHDDMES QP1» IGTQSTA I PAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKYEVSVYALXDTLTS RPAQGWTTLENVSPPRR 
ARVTDATETT I TI SWRTKTETITGFQVDAVPANGQTPIQRTI KP 
DVRS YT I TGLQPGTDYKI YI»YT1»NDNARSS P WI DASTAI DAPS 
NLRFLATTPNSLIrVSWQPPRARITGYI I KYEKPGSPPRE WPRP 
RPGVTEATITGIiEPGTE^rriYVIALKNNQKSEPLIGRKKTDELP 
QLVTIiPHPNLHGPE X LDVP ST VQKTP FVTHPG YDTGNG IQLPGT 
SGQQPS VGQQM I FEEHGFRRTTPPTTATP IRHR PR P YP PNVGQE 
ALSQTTIS WAPFQDTSEYI IS CHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI IVEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVN Y KIGE KWDRQGENGQMMSCTCIX5NGKGE FKCDP 
HEATCYDDGKTYHVGEQWQKEYIjGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5367 


235 


3591 


KKILNMLCKKNI VIEYIAD ILYEYLYG FCFSGIKKYLI IHVLRL 
ILELWMTRLLLEKSVSLQTQYLLIjIVKILSWFPGKEMRHHLQIM 
EVMMRKQDS/RIVGNGSEQQLQKEIiADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLOKPEMSIjPVKPGQ 
GDSEASSPFTPVADEDSWFSKLTYIjGCASVNAPRSEVEALRMM 
S ILRSQCQI SLDVTLSVPNVSEGIVRLLDPQTNTE IANYPI YKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LYSFATAFRRSAKQTPLSATAAPQTPDSDIFTFSVSLEIKEDDG 
KGYFSAVPKDKDRQCFKLRQGIDKKIVI YVQQTTNKELAIERCF 
GLLLSPGKDVRNSDMHI^DLESMGKSSDGKSYVITGSWNPKSPH 
FQVVNEETPKDKVIiFMTTAVDLVITEVOEPVRFlrLETKVRVCSP 
NERLFWPFSKRSTTENFFDKLKQI KQRERKNNTDTIiYEWCLBS 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEEEDNDEPLI* 
SGSGDVSKECAEKII^TWGEI,LSKMH3^NLNVRPKQLSSLVP^GV 
PEALRGEVWQLLAGCHNNDHLVEKYRII»ITKESPQDSAITRDIN 
RTFPAHDY FKDTGGDGQDSLYKI CKAYSVYDEEIGYCQGQSFIiA 
AVLLLHMPEEQAFSVLVKIMFDYGLREbFKQNFEDLHCKFYQLE 
RLMQE YI PDL YNHFI*D IS LEAHM YASQW FIiTLFTAKFPIj YMVFH 
1 1 DLLLCEG I S VI FNVALGLLr KTS KDDLU iTDFEGAL K F FR VQI> 
PKR YRSEENAK K LM E LACNMK I SQKKLKKY EKE YHTMREQQAQQ 
EDP I ERFERENRRLQEANMRDEQENDDLAHELVTS KIALR KDLD 
NAE E KAD ALN K3 LLMT KQ KLI DAEE EKRRLEEES AHL KKM CRR E 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCRE FFNKEGRVKG I SSTKEVLDEDTDEEKETLKNQL 
REMELELAQT KI* \Q L VE A SCK I QD \ LEHP F * GLPFNE \ VQ AA\ K 
KTWFNRTLS S I KTATGVQG KETC 


5368 


573 


2014 


GAAAGAADP R RG S L/3GRTM LDFA I FAVT FLUVLVGAVL YL YPAS 
RQAAGIPGITPTEEKDGNI^PDIVNSGSLHEFLVNLHERYGPVVS 
FWFGRRL WSLGTVDVL KQHINPNKTLD/l>F *NHAEVI I KVS I W 
WWQCE * KP \ QRKKLYENGVTDSIrKSNFALIiLXIjPEBLIiDKWZlfcSY 
PETQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSE IGKG FLDGSLDKNMTR KKQY EDALMQLES VLRN 1 1 KERK 
GRNFSQHIFIDSIiVQGNLNDQQIIiEDSMI FSLASCI ITAKLCTW 
AI WFLTTSEE VQKKL YEE I NQVFGNGPVTPEKI EQLR YCQHVLC 
ETVRTAKLTPVSAQLQDI EGKIDRFI IPRETLVLYALG WLQDP 
NTWPS PHKFDPDRFDDELVMKTFSS LGFSGTQECPELRFAYMVT 
TVI.LS VLVKRLHLLSVEGQVI ETKYELVTS SREEAWI TVS KRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLiADGGLRRRRRLLRGTMSASFVPNGASLED 
CHCNLFCLADLTG IKWKKYVWQG PTSAPILFPVTEEDPILSSFS 
RCLKADVLG/ VWRRDQRPERRE \L* I FWGGEDP \ VLLTI.FTMTY 
QKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRNFVRIGKWF 
VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 
LLSEEHITLAQQSNS P FQVI CCPFGLNGTLTGQAFKMSDSATKK 
LIGEWKQFYPISCCLKEMSEEKQEDMDWEDDSliAAVEVLVAGVR 
MI Y PACFVLVPQSDI PTPS P VGSTHCS SSCLGVHQVPASTRDPA 
MSSVTLTPPTS PEEVQTVDPQS VQKWVKFS S VSDGFNSDSTSHH 
GGKI PRKLANHVVDRVWQECNMNRAQNKRKYSASSoGI*CEEATA 
AKVASWDFVEATQRTNCS CItRHKNLKSRNAGQQGQAPSLGQQQQ 
ILPKHKTNEKQEKSEKPQKR PLTP FHHRVS VSDDVGMD\ADS \A 
SQRLV\ ISAP\ DSQ\ VRFSNI R\TNDVAK\TPQMHGTEMANS PQ 
PP PLS P \HPCD WDEGVTKT PST PQ S QHFYQMPTPDPLVP S KPM 
EDR IDS LS QS FP PQ YQEAVE PTVYVGTAVNLEEDEAN IAWKYYK 
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVSDELVQQ YQ I KNQCI»SAIASDAEQEPKIDP YAFVEGDEE F 
I*FPDKKDRQNS ERi£ AGKKHKVEDGTSS VTVLSHEEDAMSIiFSPS 
I KQDAPRPTSHAR P PSTS L I YDSDLAVS YTDLDN L FNSDEDELT 
PGS KRSANGSDDKAS CKE S KTGNLDPLSCISTADLHKMYPTPPS 
LEQKI MGFSPMNMNNKEYGSMDTTPGGTVIiEGNSSSIGAQFKI E 
VDEGFCSPKPSE I KDFS YVYKPENCQ I L VGCSMFAPLKTLPSQY 
LPLIKLPEECIYRQSWTVGKLELLSSGPSMPFIKEGDGSNMDQE 
YGTAYTPQTHTS CGMPPSSAPPSNSGAG ILPSPSTPRFPT PRTP 
RTPRTPRGAGGPAS AQGS VKYENSDLYS PASTPS TCRPLNS VE P 
ATV PS I PEAHSL YVNLI LS ES VMNLFKD CNS DS CC I CVCNMN I K 
GADVGV Y I P DPTQEAQ YRCTCG FSAVMNRKFGNNSGLFFEDEIiD 
1 1 GRNTDCGKEAEKR FEAX.RATS AEHVNGGIiKE S EKLSDDh I LL 
LQ DQ CTNLFSFFGAADQDPFP KSGVI SNWVRVEERDCCNDCYIiA 
I*EHGRQFMDNMSGGKVDEAI»VKSSCLHPWSKRNDVSMQCSQDIL 
RMLLSLQP VLQDAI QKKRT VR P WGVQG PLTWQQFHW4AGRGS YG 
TDESPEPLPIPTFIiLGYDYDYI,VI>SPFALPYWERLt4LEPYGSQR 
D I AYWLCPENEALlfNGAKS Y FRDLTAI Y ESCR LGQHR PVS R L L 
TDGIMRVGSTASKja,SEKLVAEWFSOAADGNNEAFSiCLKLYAQV 
OTYDLGPYIASLPJLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NTPSATLASAAS STMTVTSGVAI STSVATANSTLTTASTSS SS S 
SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
QTSAIjQTAG I SG ESS S LPTQPHPD VSES TMDRDKVG I PTDGDSH 
A VTYP PA I WYI I DP FT YENTDES TNS S S VWTLG JjLR CFLEMVQ 
TLPPHIKSTVS VQI I PCQYLLQPVKHEDRE I YPQHLKSIiAFSAP 
TAr*PT? PT , PT <;TW\TKTT .TfiFfiPGt. AMETAIiP S PDRPEC I RIjYAP P 
FILAPVKDKQTELGETFGEAGQKYNVLFVGYCLSHDQRWIIiASC 
TDLYGEIjIjETCIINIDVPNRARRKKSSARKFGIiQKLWEWCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCG1 SAADS PS ILSACLVAMEPQGSFV1MPDSVSTGS VFGRS 
TTIjHMQTS QLNT PQDTS CTHILVFPTS AS VQVASATYTTENLiDIj 
AFNPNNDGADGMGIFDI,IiDTGDDLDPDI IN I LPAS PTGSPVHS P 

AKAGPLPDWFWSACPQAQYQCPLFLKASbHLHVPSVQSDELLHS 
KH SH PLDSNQTS DVLRFVliEQYNALS WLTCDPATQDRRS CIiPIH 
FWL»NQTjY N FI MNMIi 


5370 


1226 


716 


RW S RKLELRRAAQATE S RP PQS Q EMHPPTGKE VHALKRIiRDS AN 
ANDVETVQQLLEDGADPCAADDKGRTALHFAS CNGNDQ I VQIiLL 
DHGADPNQRDGLGNTPIiHIiAACTNHVPVITTIjLRGGARVDAIJDR 
AGRT PLHLAKSKLN ILQEGHAQCLKAVR / HGGEADHP YAEG VSG 
APRAT* AARCSGVFPS PSRWLGSAPWSRSSCTI WSLPLHEAKCR 
AVR PLS SAAQG S APS SSS CCTVST SIJVLAES LSLFRACTS I>PVG 
GCISWL 


5371 


1331 


167 


IAAML.WKLLLRSQSCRLCS FRKMRS P PKYRPFLACFTYTTDKCS 
SKENTRTVE KL YKCS VD I RKIRR\ * KDGYF* RMKPMLKKLRI / F 
LQELGADETAVAS I LERCPE Al VCS PTAVNTQRKLWQLVCKNEE 
EHKl»IEQFPESFFTIKDQSNQKI»NVQFFQELGLKNWISRt*LT 
AAPNVFHNPVEKNKQMWII^ESYLDVGGSEANMKVWLLKLLSQ 
NP F 1 1»LNS PTAI KETIiEFLQEQGFTS FE I LQIiLSKLKGFLFQLC 
PRS IQNS 1SFS KNAF KCTDHDLKQIiVLKCPAliL Y YSVPVLEERM 
QGLLREGI S I AQI RETPMVI>ELTPQIVQYRI RKLNSSGYR I KDG 
HLANLNGSKKEFEANFGKIQAKKVRPLFNPVAPLNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPIiFSAVQGKDEIIiHKAIiCFCPWLGKGGME 
PLRLLIIjLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR 
RKAW CRQLGEKG PCQRWS THNLV/LLS FLRRWNG STAITDDTLG 
GTLT I TLRNLQP HDAGI»YQCQSI*HGS E ADTI*RKVLVEVLADPIjD 
HRDAGDLWFPG\DLRASRM?MWSTAS?GASWKEK3PSHPI*PSFS 
SW PAS FSSRF* QPAPSGLQPGMDRSQGHIHPVNWTVAMTOG ISS 
KLCQG 


5373 


2814 


346 


VKKTKS I FNS AMQEMEVY VENIRRKFGVFNYSPFRTPYTPNSQY 
QML1J&PTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPV1.SGG 
TGRR I SLSDMPRS PMSTNS SVHTGSDVEQDAEKKATSSHFS ASE 
ESMDFLDKSTAS PASTKTGQAGSLSGSPKPFS PQI»SAP ITTKTD 
KTSTTGS ILNLWLDRSKAEMDLICELSESVQQQSTPVPI.IS PKRQ 
IRSRFQLNLDKT IESCKAQLGINE I S EDVYTAVEHSDSEDS EKS 
DSSDSEYISDDEQKS*GTSOEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE
GRKNKKEP KE PS PKQDVVGKTPPSTTVGSHSP PETPVIiTRSS AQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPIiLPKE\TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVS S VNGDI»P IGTASADVAADIAKYTS KL\ MDAIKGTM\TEI 
YKDLS KN\TTWKAQLAEDSQGLRI E I EKLQWLHQQElASEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKiCKQWC 
»NPIf JfRAT F YCfWNTS YCDYPCO\ OAHWPEH\MKS CTOSATAPO 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKS KE SGSTLDX>SGSRETPSS I LLGSNQGSDHSR\ SNKSS WS SS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5374 


2814 


346 


VKKTKS I FNS AMQEME VYVEN IRRKFGVFN YS PFRTPYTPNSQY 
QMLLD P TN P SAG TAK I DKQE K VKLN FDMTAS P K I LMS KP VLS GG 
TGRRI SLSDMPRS PMSTNS SVHTGSDVEQI2AEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSIiSGSPKPFSPQIiSAPITTKTD 
KTSTTGS ILNLNIiDRSKAEMDLKELSES VQQQSTP VPLISPKRQ 
I RSRFQLNLDIC? IESCKAQLGINE I S EDVYTAVEHS DSEDS EKS 
1 DSS DS E YI S DDEQKS *GTSQE DTEDKEGCQMDKEPS AVKKKPKP 






WO 01/53312 



PCT/US00/34263 



TNPVEI KEELKSTSPAS EKADPGAVKDKAS PEPEKDFSG KAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE
GRKNKKEPKEPS PKQDWGKTPPSTTVGSHS P PETP VLTRSSAQ 
TSAAGATATTSTSSTVT VTAPAPAATGS P VKKQRPLLPKE \TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVS S VNGDL P IGTAS ADVAAD I AKYTS KI>\ MDAI KGTM \ TE I 
YNDLSKN\TTWKAQLAEDSQGI^IEIEKLQWLHTOEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIpyCCWNTSYCDyPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEADAE \ VNTETLNKS SQGS S S S TQS AP S ETAS A\ S KE KETSA 
EKSKESGSTLDLSGSRETPSSILIiGSNQGSDHSR\SNKSSWSSS 
DEKRGS \TRSDHN/ TPS TQHGRSLLPGKESRAGTPFLGTS K 


5375 


2907 


1116 


HIFIiAEEEPMLERRCRGPLAMGPAQPRLLSGPSQESPQTljGKES 
RGLRQQGTSVAXQSGAQAPGRAHRCAHCRRHFPGWVAXLWLHTR 
rcqa/rglpi, pcpecgrr frhap FLAJLHRQVKAAATPDWGFACH 

hCGQS FRGWVALVLHLRAHSAAKAGPPACPKMARDAFWRRKAAS 
SS I LRRCHPSRPRGPRPFI CGNCGRS I LPTWDQ/IjKVAHKRVHV 
SRRP*ERGPPAKVFKGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 

ccgkrfrhk\pnlirshaactsgerphq/csrecg\krftwkpy 

LTSXHRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 

qdp ieappsi, ys cddcgrs frlerflrahqrqhtgerp ftcaec 
gknfgkkthlvahsrvhsgerpfrlarkcgrrflprasqsggrn 
saepnaprfgpfvcpdcgkafrhkpylaahrpiatpaekpyvcp 
dcrkafsqksni>\vshrrihtgerpyacpdcdrsfsqksni,ith 

RKSHIRDGAFCCAICGQTFDDEERIjIiAHQKKHDV 


5376 


4504 


591 


VST fs lclwpaggggrgrvsnmaqskrhvysrtpsgsrmsaeas 

ARPLRVGSR VEVI GKGHRGTVAY VGATLFATGKWVGVI LDEAKG 
KNDGTVQGRKYFTCDEGHGIFVRQSQIQVFEDGADTTSPETPDS 
SAS KVLKREGTDTTAKTS K1>RGL>KP KKAPTAR KTTTRRP KPTRP 
ASTG VAGAS S S LG PSGSAS AGELS S SE PSTPAQTPLAAP I 1 PTP 
VLTS PGAVP P LiPS P S KEEEGLRAQ VRDLEE KLETLRLKRAE DKA 
KLKEL EKHKT QLEQVQE WKS KMQEQQADL»QRRLKEARKEAKEAI» 
EAKERYMEEMAITTADAIEMATLDKEMAEERAESliCX^EVEiAliKER 
VDELTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDAI>V 
RMRDLSSSEKQEHVK\LQKLMEKKNQELEVVRQQRERLQEELSQ 
AESTIDELKEQVDAAI^AEEMVEMLTDRNl^LEEKVREI^RETVG 
DLEAMNEMNDE LQ EN AR BTELELREQLDMAGARVREAQKR VEAA 
QETVADYQQTIKKYRQLTAHLQDVNRELTNQQEASVERQQQPPP 
ETFDFKIKFAETKAHAKAIEMELRQMEVAQANRHMSLLTAFMPD 
SFLRPGGDHDCVLVLLIiMPRLICKAELIRKQAQEKFEIiSEKCSE 
RPGIiRGAAGEQLSFAAIGI>VY\SLMPAAGHRYHRY*CHALSQCR 
LD \VY KKVGSLY P EMSAH ERS LDFL I EXjLHKDQLDETVNVE PI*T 
KAIKYYQHLYS IHJLAEQPEDCTMQLADHI KFTQS AI,DCMS VEVG 
RLRAFI*QGGQEATDIAIiLLRDLETSCS\DIRQFCKXIRRRMPGT 
DAPG1 PAALAFG PQVSDTLLD CRKHLTWWAVLQE VAAAAAQI.I 
APLAENEGLLVAALEELAFKASEQIYGTPSSSPYECIiRQSCNII* 
ISTMNK\LVTAMQEGE YDAERPPSKP PP \VELRAAALRAE I TDA 
EGIXSLIOiEDRBTVIKELKKSLKIKGEELSEANVRliTDLEKKLDS 
AAKDADERIEKVQTRLEETQALLRKKEKEFEETMDAbQADIDQI* 
EAEKAELKQRLNSQS KRT I EG LRGPPPSG I ATLVSG I AGEEQQR 
GAIPGQAPGSVPGPGLVKDSPLLLQQISAMRIjHISQLQHENSII* 
KGAQMKASLASLPPLHVAKLSHEGPGSELPAGALYRKTSQLLET 
IjNQLS THTHWD I TRTS PAAKS PS AQLMEQ VAQI»KS LSDTVE KI» 
KDEVIiKETVSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQIiHQIjHSRLIS 


5377 


762 


1106 


dvpckrvlpaeaqekgqltlscgesgeeg\f*yhevrqaeges* 
/wfgpnvrlvhtqlktkjcpsgti^kfylhtgstkj-aarisctx 
ss * wpg ydgwwggqyi fi frgmrweeqp 


5378


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
SFS FRNSKQTYSGVP 1 I AANMDTVGTFEMAKVLCKS * VPGSFWJD 
VPQMGCVFLI YKLFTLKWKMLLLS VLLPAS I LVAEKFS LFTAVH 
KH YS LVQWQEFAGQNPD CLEHLAAS SGTGS SDFEQLEQI LEAI P 
QVK YI CLD VANGY SEH F VEFVKD VRKR FPQHT IMAGNVVTGEM V 
EELILSGADI I KVG 1 G PGSVCTTRKKTGVGY PQLSAVMECADAA 
HGLKGHI I S DGGCS C PGD VAKAFG AGADFVMLGGMLAGHSESGG 
ELI ERDGKKYKLFYGMS S * I \ AM \ KKYAGGVAE YRASEG KTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5379 


2009 


664 


Q ASGTTLR P L P DLPQLKRREATS RNRALKPRGRLVLMTS CLPAL 
RP IATPRLSAMPHI DNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
SFS FRNSKQTYSGVP 1 1 AANMDTVGTFEMAKVLCKS * VPGSFWD 
VPQMGCVFL I YKLFTLKWKMLLLSVLLPAS I LVAEKFS L FTAVH 
KH Y SLVQWQE FAGQNPDCLEHLAAS SGTGS S DFEQLEQI LEAI P 
QVKYI CLDVANGYS EHFVBFVKDVRKRFPQHTIMAGNVVTGEMV 
EELILSGADI I KVG IGPGSVCTTRKKTG VG Y PQLSAVMECADAA 
HGLKGHI I SDGGCS C PGDVAKAFG AGADFVMLGGMLAGHS ESGG 
ELIERDGKKYKLFYGMSS * I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


PS RAGG AERGRAAAARS PGGSAAGWECPSVLDEAGACTMSSCVS 
SQ PS SNRAAPQ DELGG RGSSS S ESQKPCEALRG LSS LS I HLGME 
S P I WTECE PGCAVDLGLARDRP LE ADGQEVPLDTSGSQARPHL 
SGRKLSLQERS QGGLAAGGSLDMNGRCI CPSLP YSPVSSPQSS P 
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGWKLA 
YNENDmTYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGP I\ EQVYQE IA\ ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F \ ELVNQG P VMEVPTLKP LSEDQARF YFQDL I KGIEYLHYQKI I 
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA 
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI 
KLHPWVTRHGA2 PLPSEDSNCTLVEVTEEEVENSVKHI PSLATV 
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
S ELKT * KI S P L PACCKVT * EFPH P SGCR PSCWQP PFLHTHSQPR 
♦PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381 


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGMB 
SF IWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLiDMNGRCI CPSLP YSPVSSPQSSP 
RL P RR PTVE S HHVS I TG MQDCVQLNQ VTLKD E IGKGS YGWKLA 
YNEKDOTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGP I \ EQVYQE IA\ ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSET)QARFYFQDLIKGIEYLHYQKII 
H\RD I KPSNLLVGEDGH I KI ADFG VSNE FKGS DALLSNTVGTPA 
FMAPE S LSETR K I FSGKALDVWAMGVTLYCFVFG * CPFMDER IM 
CLHS K I KSQALE FPDQ PD I AEDLKDLI TRMLDKNPESRI WPEI 
KLHPWVTRHGAE PLPSEDENCTLVEVTEEEVEMS VKH I PSLATV 
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
* PE PPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTS WL 
PDLVGAPGSHFCFLNIALLRYNSHTM


5382


1536


203 


GARGSQQDAPALQEAEVRGPERAQ PARGRMTKARL FRLWLVLGS 
VFMILLIIWWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE 
LTADS DVDE FLDKFLS AGVKQSDLPRKETEQP PAPG SMEE SVRG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DIPNSEIiSHLIVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLH 
RGAPYRDPLRI PREHVHNASAHLTFNKFWRRYGKLSRHLMKVKL 
KKYTKFLFVRDPFVRLISAFRSKFELENEEF/ * PQVRRAHAAAV 
RQPHQ PARLGARGLPRWPQ\ VSFANF IQYLLDPHT3KLAP FNEH 
WRQVYRLCH PCQ I DYD FVGKLETLDEDAAQLLQLLQVDLAAPLP 
PELPGTGPPSSWEEDWFAKIPLAWRQQltYKliYEADFVLFGypKP 
ENLLRJD 


5383 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFGMYSAEELKKLS 
VKSITNPRYLDSLGNPSANGLYDLALGPADSKEVCSTCVQDFSN 
CSGHLGHIELPLTVYNPLLFDKLYIJ^LRGSCLNCHMLTCPRAVI 
HLLLCQLRVLEVGALQAVYELERILSRFLEENADPSASEIREEL 
EQYTTEIVQNNIiLGSQGAHVKtTV'CESECSKLIALFWKAHMNAKRC 
PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 
IGKRGYLTPTSAREHLSAIiWKNEGFFLNYLFSGMDDDGMESRFN 
PSVFFLDFLWPPSRSRPVSRLGDQMFTNGQTVNLQAVMKDWL 
IRKLLALMAQEQKLPEEVATPTTDEEKDSL I AIDRS PLSTLPGQ 
S L I DKL YN I W IRLQSH VN I VFDS EMDKLMMD KY PG I RQ I LEKKE 
GLFRKHMMGKRVDYAARSVICPDMYINTNEIGI PMVFATKLTYP 
QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTAI>SAVD^4T 
QREAVAKQKLTPATGAPKPQGTK I VCRHVKNGD I LLLNRQPTLH 
RPS I QAHRAR 1 IiPEEKVLRLHYANCKAYNADFDGDEMNAHFPQS 
EOGRAEAYVLACTDQQYLVPKDGQPLAGLIQDHMVSGASMTTRG 
CFFTREHYME LVYRGLTD KVGR VKLLS PS I LKPFPLWTGKQWS 
TLLINI I PEDHIPLNLSGKAKITGKAWVKBTPRSVPGFNPDSMC 
ESQ VI IREGEUjCGVLDKAHYGSS AYGLVKCCYE I YGGETSGKV 
LTCLARLFTAYLQLYRGFTLGVEDILVKPKADVKRQRI I EESTH 
CGPQAVRAALNIjPEAAS YDEVRGKWQDAHLGKDQRDFNM3 DI>KF 
KEEVNHYSNE INKACMP FGLHRQFPENTLQLMVQSGAKGSTVNT 
MQISC^LGQIELEGRSTPLMASGKSLPCFEPYEFTPRAGGFVTG 
RFLTG I KPPEFFFHCMAGREGLVDTAVKTSRSG YLQRCI I KHLE 
GLVVQYDLTVRDSDGSVVQFLYGEDGLDIPKTQFLQPKQFPFLA 
S N YEV I MKSQHIjHEVIjSRAD PKKALHH FRA I KKWQS KHPNTLIiR 
RGAFLSYSQKIQEAVKAIiKLESENRNGR/RPWDS/G/RMLRMWY 
ELDEESRRKYQKKAAACPDPSIjSVWRPDIYFASVSETFETKVDD 
YSQEV7AAQTEKSYEKSELSLDRlJRTLLQIi\KWQRSLCEPGEAVG 
LLAAQSIGEPSTQMTI^TFHFAGRGEMNVTLGIPRLREILMVAS 
AKIKTPMMSVPVLNTKKAIiKRVKSLKKQLTRVCLGEVLQKIDVQ 
ESFCMEEKQNKFQVYQLRFOFLPHAYYQQEKCLRPEDILRFMfiT 
R FFKLLMES T KKIOJNKAS AFRNVNTRRATQRDLDNAGELGRSRG 
EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
PSRP PDAAPETHPQ PGAPGA\ EAMERRVQAVRE I H P FI DDYQYD 
TEESLWCQVTVKLPLMKINFDMSSLVVSrAHGAVIYATKGITRC 
LIJaETTrmKNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDIH 
AIANTYG I EAALR V I EKE I KDVFAVYG IAVDPRHI*SLVADYMCF 
EGVYKPLNR FGIRSNSS PLQQMTFETS FQFLKQATMLGSHDELR 
S PSACLWGKWRGGTGLFELKQPLR 


5384 


196 


886 


QSCGQRLPTVL*L*GPPGSCPCII>SI*F\PGRPHAIjPEIRPYINI" 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTAIiKSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGI YFFSLNVHS WW YKETYVH I MHNQKEAVI L YAQPS 

ERSIMQSQSVMLDIiAYGDRVWVRLFKRQRENAlYSNDFDTYITF 
SGHLIKAEDD 


5385


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A
VKKIENNSI/LVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ
SDGERKAYVRLAPDYDALWATKIGIT


5386 


326 


799


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A
VKKIENNSI/LVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ
SDGERKAYVRLAPDYDALWATKIGIT


5387 


2 


2117 


FVVAASGGCWF VLGERRAGSIJjSAS YGT FAMPGMVLFGRRWAIA 
SDDLVFPGFFELVVRVLWWIGILTLYLMHRGKLDCAGGAIjLSSY 
L 1 VLM I LLAWICT VS AI MCVS MRGTI CNPGPRKSMSKLL YIRL 

ALFFPEM vwaslgaawvadgvqcdrtwngi iatvws WI 1 1 AA 

TWSIIIVFDPr^KMAPYSSAGPSHLDSHDSSQLLNGLKTAAT 
SVWETRIKLLCCCIGKDDHTRVAFSSTAELFSTYFSDTDLVPSD 
IAAGLALLHQQQDN I RNNQEPAQ WCHAPGS S QEADLDAELKNC 
HHYMQFAAAAYGWPLYIYRNPLTGLCRIGGDCCRSKNPQTMT/M 
VGGDQLOL/ CTSAP I LHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHR KES WVAVRGTMS LQD VLTDLS AES E VLD VECE VQDRLAH 
KGISQAARYVYQRLINDGILSQAFSIAPEYRLVIVGHSLGGGAA 
ALLATMVRAAYPQVRCYAFS PPRGLWS KALQEYSQSFI VS LVLG 
KDVI PRLSVTNLEDLKRR I LRVVAHC!N KPKYKI LLHGLWYELFG 
GNPNNLPTELDGGDQEVLTQPLLGEQSLLTRWSPAYS FSSDS PL 
DSSPKYPPI>YPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE 
FSKILIGPKMLTDHMPDILMRALDSWSDRAACVSCPAQGVSSV 
DVA 


5388 


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN 


5389 


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN 


5390 


217 


1332 


EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDS ITRDDNI AAFKRI RLRPRYLRDVSEVDTRTTIQGEE I 
S AP I CI APTGPHCLVWPDGEMSTARAAQAA\GI CYITSTFASCS 
LEDIVIAAPEGIJJWFQLYVHPDLQLNKQLIQRVESLGFKALVIT 
LDTP VCGNRRHDIRNQLRRNLTLTDLQSPKKGNAI P YFQMTP I S 
T SLCWNDLSWFQS I TRLP 1 1 LKGILTKEDAELAVKHNVQG 1 1 VS 
NHGGRQLDEVLAS I DALTEWAAVKGKI EVYLDGGVRTGNDVLK 
ALALGAKCIFLGDAI LWAIAS KGEHGVKEVLNI LTNEFHTSMA\ 
T. r vnr t T> o va. c t WDwrT.t/nT^^T? t. 

1j ibLKb VJ-vE, iivtuMiivyr onu 


5391 


1


1292 


VKKAAGRSRGP PTAGGQR CEEAPGTVMERRLGVRAW VKENRGS F 
Q PPVCNKLMHQEQLKVMFVGGPNTRKD YHI EEGEEVF YQLEGDM 
VLRVLEQGKHRDWIRQGEI FLLPARVPHSPQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAP I IQEFFS 
SEQYRTGKPIPDQLLKEPPFPLSTRSIMEPMSLDAWLDSHHREL 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDWJLWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
DPACKKS PWGEPSCHGLKAATGVPSTLEVPSLPNNS PS PHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
PVLPGGLPPAPLLP I PLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 


I RGSNAQKWGAS G SGGAG PQP DPAGPGG VPALAAAVLGACE PR 
CAAP CPLPALSRCRGAGSRGSRGGRGAAG SGDAAAAAEW I RKGS 
F IHKPAHGWLHPDARVLG PGVS YWR YMG CI EVLRSMRS LDFNT 
RTQVTREAINRLHEAVPGVRGS WKKKAPNKALAS VLGKSNLRFA 
GMSIS IH I STDGLS LSVPATRQVI ANHHM PS I SFASGGDTDMTD 
YVAYVAKDPINQRACHI LECCEGL\AQS I I STVGQAFELRFKQY 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 
GGL VDSRLALTQPCALTALDQG PS PSLRDACSLPWDVGS TGTAP 
PGDGYVQADARGPPDHEEHLYVNTQGLDAPEPEDSPKKDLFDMR 
PFEDALKLHECSVAAGVTAAPLPLEDQWPSPPTRRAPVAPTEEQ 
LRQEPWYHGPJ^RPJUVERMLRALGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGVVRTKDVLFESISHLIDHHLQNGQPIVAAE 
SBLHLRGWSREP 


5393


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5394


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQAEDSVTLTCQGAPQP/ERSDSIQWFHKG
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGVVPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGIAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS


5396


3135 


531 


RASDAKNQEGLLKTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGVVPPASGGGRVQNSPFVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAFNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS


5397 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGVVPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS


5398


56 


5426 


SGEVCRMESNFNQEGVPRPSYVFSADPIARPSEINFDGIKIiDLS 
HEFSIiVAPNTEANS FESKDYLQVCLRI RPFTQSEKELESEGCVH 
ILDSOTWLKEPQCILGRI>SEKSSG\QM\AQKFSFFPGFLGPAT 
TQKEFFQGCIMHP\VKDLLKGQSRIiIFTYGIiTNSGKTYTFQGTB 
ENIRILPRTLNVLFDSLQERLYTKMNLKPHRSRBYLRLSSEQEK 
EEI ASKSALLRQIKEVTVHNDSDDTLYGSLTNSLNISEFEES I K 
DYEQANLNMANS I KFS VWVS FFE I YNEYI YDLFVPVSS KFQKR K 
MLRL»S QDVKG YS FI KDLQWIQVSDS KEAYRLLKLG I KHQSVAFT 
KLNNASSRSHS I FTVKI LQ IEDSEMSRVIRVSELSLCDLAGSER 
TMKTQNEGERLRETGNINTSLLTLGKCINVLKNSEKSKFQQHVP 
FRESKLTHYF/QSFFNGKGKICMIVNISQCYIAYDETLNVLKFS 
AIAQKVCVPDTLNSSOEKLFGPVKSSQDVSLDSNSNSKILNVKR 
ATI SWENS LEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 
TLEENKAF I SHEEKRKL.LDL I EDLKKKBINEKKEKLTLEFKI RE 
EVTOEFTOYWAOREADFKETIjLQEREI I»EENAERR1iAI FKDLVG 
KCDTREEAAKDICATKVETEEATACLELKFNQIKAEIAKTKGEL 
I KTKEELKKRENESDSLIQEI.ETSNICKI ITQNQRI KEL INI I DQ 
KEDTINEFQNLKSHMENTFKCNDKADTSSLI INNKLI CNETVEV 
PKDSKSKI CSERKRVNENELQQDEPPAKKGS I HVSSAITEDQKK 
SEEVRPNIAEIEDIRVLQENNEGLRAFLLTIENELKNEKEEKAE 
LNKQIVHFQQHUSI,SEKKNLTL,SKEVQQIQSNYDlAIAEIiHVQK 
SlC^QBQEEKIMKLSNEIETATRSITNNVSQIKIiMHTKIDEXRTL 
DSVSOISNIDLIiNIfRDLSNGSEEDNLPNTQLDLLGNDYIjVSKQV 
KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEEXEQ 
QIEKLQAEVKGYKDENWRLKEKEHKNQDDLLKEKETLIQQLKEE 
LQEKNVTLDVQIQHWEGKRALSELTQGVTCYKAKIKEI.ETII.E 
TQKVERSHSAKLEQDILEKESI ILKLERNLKEFQEKLQDSVKNT 
KDLNVKELKXKEBITQLTNNLQDMKHLLQLKEEEEETNRQETEK 
LKEELSASSARTQN\LWADIiQRKEEDYADLKEKLTDAKKQIKQV 
QKEVS VMRDEDKLLRI KINEIiEKKKNQCSQEI»DMKQR\ TIQQLK 
EQLINQKVEEAIQQYERACKDLNVKEKIIEDMRMTLEEQEOTQV 
EQDQVI/\EAKLHEVERIATELDRWRVKCNDLErKNNQRSNKEHE 
NNTDVLOKIiTWbQDELQES EQK YNADRKKWLEEKMML I TQAKEA 
ENIRNKEMKKYAEDRERFFKQQNEME I LTAQLTE KDSDLQKWRE 
ERDQLVAALE IQLKAIiISSNVQKDNE I EQLKRI I SETSKIETQI 
MDI KPKRI SSADPDKLQTEPLSTSFE I SRNKI EDGSWLDS CEV 
STENDQSTRFPKPELEIQFTPLQPNKI4AVKHPGCTTPVTVKIPK 
ARKRKSNEMEEDIiVKCENKKNATPRTNLKFPlSDDRNSSVKKEQ 
KVAIRPSSKKTYSLRSQASIIGVNLATKKKEGTLQKFGDFLQHS 
PS I LQS KAKKI I ETMSSSKLSNVEASKENVSQPKRAKRKLYTSE 
IS S P I DI SGQ VI LMDQKMKESDHQ 1 1 KRRLRTKTAK 


5399


705


230 


GPRMAKFLSQDQINBYKECFSbYDKQQRGKIKATDLMVAMRCLG 
ASPTPGEVQRHIiQTHGIDGNGEIjDFSTFIiTIMHMQIKQEDPKKE 
ILLAMLMVDKEKKGYVMASDLRS KLTS LGEKI/THKEV \ DDL.FRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


248 


SHCS SGME I PPTN YP AS RAA1»VAQN Y INYQQGTPHRVFEVQKVK 
QASMEDI PGRGHKYRLKFAVBEI I QKQVKVNCTA3 VLYPS TGQE 
TAPEVNFTFEGETGKNPDEEDNTFyQRLKSMKEPLEAQNI\PDN 
FGNVSPEMTLVLHLAWVACGY I IWQNSTEDTWYKMVKIQTVKQV 
QRNDDF I ELDYTI LLHNIAS QE 1 I PWQMQVLWHPQYGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWS YGPTTSLAFLAPRDFP FP PKLIiI HPQAWRLSCGAGSMGS 
QAAAE WRN WASWEG S S S LSG CSMGCFKDDR I VFWTWMFSTY FME 
KWAPRQDDMLFYVRRKLAYSGSESGADGRKAAEPEVEVEVYRRD 
SKKLPGLGDPDIDWEESVCLNIjILQKLDYMVTCAVCTRADGGDI 
H I H KKKSQQV FAS P S KH PMDS KGEESK I S YPNI FFM I D S F\ BE \ 
V FS DMTVG KG EMVCVEL VAS DKTNTPQG VI FQGS IR YEALKKVY 
DNRVSVAARMAQK\MS FGFS KYSNMEF\VR\MKGPQGKGHAEMA 
VSRVSTGDTS PCGTEEDSSPAS PMH ER VTS FS TP PTP E RNNRPA 
FFSPSIjKRKVPRNRIAEMKKSHSANDSEEFFREDDGGADLHKAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHLTYVTLPLHRI 
LTDILEVRQKPILMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYAESPGASSPEQ
PKRKKGRKTKPPRPDSPATPPNISVKKKNKDGKGNTIYLWEFLL
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLMRKHKNKP\D
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHVVQ
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPWVSP
RNQQALHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT
VSV\ASSPSFS\ATAPVVTLFLLGSSQLVAHPPGTVITSVIKTQ
ETKTLTQEVEKKESEDHLKENTEKTEQQFQPYVMWSSSNGFTS
QVAMKQNELLEPNSF


5403 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHVVQ
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPWVSP
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QCVLTSNVQTICNGT
VSV\ASSPSFS\ATAPVVTLFLLGSSQLVAHPPGTVITSVIKTQ
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS
QVAMKQNELLEPNSF


5404 


187 


1111 


LPVTL I FAKMKTLQSTLLLLLL VPL I KPAPPTQQDS R 1 1 YD YGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI I PNEKS LQLQKDE AI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
L YARFNKIKKLT\ AKDFADI PNLRRLDFTGNLIED IEDGTFS KL 
S LVEELSLAENQLLKL P VLPPK LTLFNAXYNKI KS RG I KANAFK 
KLNNLTFLYLDHNALESVPLNLPESLRVIHLQFNNIASITDDTF 
CKANDTS Y IRDRI EE I RLEGNP I VLGKHPNS FI CLKRLPIGS YF 


5405


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
I LS LDQIKAI RGSNEYTEGPSVVKRPAPRTAPRQEKHERTHE 1 1 
P INVNNN YBHRHTSHLGHAVLPSNARGPILSRSTS TGSAASSGS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQHKF I CEQCGKCKCGE CTAPRTLPS CLACNRQ CLCSAE 
SMVEYGTCMCL\ VKG I F YHCSNDDEGDS YSDNPCS CSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


279 


2732 


RWRTYNVEGPLTFMDVAIEFCLEEHQCLDTAQQNLYRNVMLENY 
RNLVFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHBMVAKPPVMC 
SH FTQDFWPEQHI KDP FQKATLRRYKNCEHKN VHLKKDHKSVDE 
CKVHRGG YNGFNQCLP ATQS K I FLFDKCVKAFH KFSNSNRH KI S 
HTEKKLFKCKECGKSFCMLSHLAQHKIIHTRVNFCKCEKCGKAF 
NCPS I ITKHKRINTGEKPYTCEECGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS ILTTHKI IRTGEKFYKCKECAKAFNQSS 
NI.TEHKKIHPGEKPYKCEECGKAFNWPSTLTKHKRIHTGEKPYT 
CEECGKAFNQFSNLTTHKRlHTA\EKFYKCTECGEAFSRS\SNIi 
TKHKEIHTEKKPYKCEECGKAFKWSSKLTEHKLTHTGEKPYKCE 
KCGKAFNCPS I ITKHNRINTGEKPYTCEECGKVFNWSSRLTTHK 
KN YTRYKLYKCEECGKAFNKSS 1LTTH KKI H I EKKFYKCEE CGK 
AFKWS S KLTEHKITHTGEKPYKCEECG KAFNHFS I LTKHKRI HT 
GEKPYKCEECGKAFTQSSNLTTHKKIHTGEKFYKCEECGKAFTQ 
SSNLTTHKKIHTGGKPYKCEECGKAFNQFSTLTKHKIIHTEEKP 
YKCEECGKAFKWS STLTKHKI IHTGEKPYKCEECG\KAFKLSST 
LSTHKI IHTGEKP YKCEKCGKAFNRPSNLI EHKKIHTGEQPYKC 
EECGKAFNYS SHLNTHKR IHTKEQP YKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTVILTTPQTFSNI K 


5407 


3 


659 


RPRRRQSSCCTGWLAGWLLRAAPRFCRRTETDMEQGKGLAVLIL 
AI I LLQGTLAQS I KGNHLVKVYDYQEDGS VLLTCDAEAKN ITWF 
KDG KM I GFLTE DKKKWNLGSNAKDPRGM YQCKGSQNKSKPLQVY 
YRMCQNCI ELNAATI SG FLFAE I VS I FDliAVGV Y F I AGTGME FR 
OS \RAS DKQTLLP \NDPAPTQPLKDPRKMTQYSHLQGN \QLRRN 


5408


2745


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQIL
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP
TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK
NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH
AQDCDDSMGYQYPFTLRVVQKDGNSCAWCPWYRFCRGCKIDCGE
DRAFIGNAYIAVDWHPTALHLRYQTSQERVVDEHESVEQSRRAQ
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR
LPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDPSAFLVPRDP
ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS
PSSLSANIISSPKGSPSSSRKSGTSCPSSKMSSPNSSPRTLGRS
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICEIADALSRGH
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5409 


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM
NSSIQCVSWTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQIL
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP
TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK
NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH
AQDCDDSMGYQYPFTLRVVQKDGNSCAWCPWYRFCRGCKIDCGE
DRAFIGNAYIAVDWHPTALHLRYQTSQERVVDEHESVEQSRRAQ
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCIATKKLDLWR
LPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDPSAFLVPRDP
ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS
KGRLRLPQIGSKNKLSSSKEMLDASKENGAGQICELADALSRGH
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK
TDGKKMADTSSMDEDFESDY\EKYCVLQ


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHL YKLIiVIGDLGVGKTS X I KRY 
VHQNFSSHYRATI G VDFAIiKVLHWDPETWRLQLWDI AGQER FG 
NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKLSLPNS 
KP VS VVLIiAN KCDQG KDVLMNNGLKMDQFCKEHGFVG W FETS AK 
ENINIDEASRCLVKHILANECDLMES I epdwkphltstkvasc 
SG\ CAK I Li VGTFAG VW 


5411 


1302 


289 


TGPAAAGRRKALGSFGKPSPVTGLRAARRRRTRPSAPAAPSVGC 
G KRRES DAG AGGE RAS VRTG SGRRGG RTMAGDS EQTLQNHQQ ?N 
GGEFFLIGVSGGTASGKSSVCAKIVQLLGQNEVDYRQKQWIIiS 
QDSF YR VIiTSEQKAKALKGQ FNFDHPDAFDNEIi I LKTLKE ITEG 
KTVQI PVYDFVSHSRKEETVTVYPAD WLFEGI LAFYSQER/ IR 
DLFQMKLFVDTDADTRIiSRRVLKDI S ERGRDLEQI LSSSTLRFV 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\D1 
LNGGPS \NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGISNFFHKEANFWFBVSG YL ISPLRSPFVDPALE WSLMAS PWN 
KMEGESSR FEIHTPVSDKKKKKCS IHKERPQKHSHE I FRDSSLV 
NEQSQ ITRRKKRKKDFQHL.ISSPLKKSR I CDETANATSTLKKRK 
KRRYSALEVDEEAGVTVVLVDKENINNTPKHFRKD^/DVVCVDMS 
IEQKLPRK\PKTDKFQVLAKSH\AHKSEALHSKVREKKNKKHQR 
KAASWESQRA\RDTLPQSEFPTQEESWLSVGPGGEITELP\ASA 
HKNKSKKKKK3CSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 
GLDDETPQLL.GPTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 
E VGADMQES \RPAVGLHGETAGI PAPAYKNKS KKKKKKSNHQEF 
E AVAMPES LESAYPEGSQVGS EVGTVEGS TALKGFKESNSTKKK 

S KKRKLTS vkrar vsgddfs vpsknsestlfds vegdgammeeg 

VKSRPRQKKTQACLAS KHVQEAPRLE P ANEEHNVETAEDSE I R Y 
LSADSGDADDSDADLGSAVKQLQEFIPNI KDRATSTI KRMYRDD 
LERFKEFKAQGVAIKFGKFSVKENKQLEKNVEDFLALTGIESAD 
iCLLYTDR YPEEKS VI TNLKRR YSFRLHIG \ RNIARPWKLI YYRA 
KKMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 
S VALKFSQ IS SQRNRGAWS KS ETRKL I KAVEE VI LKKMS PQEL K 
EVDSKLQENP ESCLS I VREKL YKGISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I Y YGMNALRAKVSL I ERL YE INVEDTNE I 
DWEDLASAIGDVPPS YVQTKFSRLKAV YVPFWQKKTFPEI IDYL 
YETTLPLLKEKLEKMMEKKGTKIQTPAAPKQVFPFRD I FYYEDD 
SEGGGHRKRKRRPRRHAWFTP V I PVLWEAKAGWI I 


5413 


3753


1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG 

tpij^gagpgaarqsprsalfrvghmssvklddellepXdmdpp 
hpfpkei phnekllslkyesldydnsenqlfleeerr i nhtafr 
tvei krwvi caligiltglvacfidi vvenlaglkyrvi kgni d 
kftekgglsfslllwatlnaafvlvgsvivafiepvaagsgipq 

IKCFLNGVKIPHVVRLKTLVIKVSGVILSVVGGLAVGKEGPMIH 

sgsviaagisqgrstslkrdfkifeylrrdtekrdfvsagaaag 
vs aafgap vgg vlfsleegas fwnqf ltwr i ffasmi s tftlnf 

VLS I YHGNM WDLSS PG L INFGR FDS E KMA YT I HEI P VF I AMGW 
GGVLGAVFNALNYWLTMFRIRYIHRPCLQVI EAVLVAAVTATVA 
FVLIYSSRDCQPLQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 
S WSLFHDP PGSYN PLTLGLFTLVYFFLACWT YGLT VSAG VF I P 
SLLIGAAWGRLFGIS LS YLTGAAI WADPGKYALMGAAAQLGGI V 
RMTLS LTVIMMEATSNVT YGFP IMLVLMTAK I VGDVF I EGL YDM 
HIQLQSVPFI>HWEAPVTSHSI»TAREVMSTPVTCriRRREKVGVIV 
DVLSDTASNHWGFPWEHADDTQPARIjQGIjI LRSQLI VLbKHKV 
FVERSNLGLVQRRbRLKDFRDAYPRFPP IQS IHVSQDERECTKD 
LSEFMWPSPYTVPQEASLPRVFKLFRALGLRHLWVDNRNQWG 
LVTRKDLARYRLGKRGLEELSLAQT 


5414 


2130 


390 


GVASAWDRALFSPLLSPTSRVPRTSPPRCVSTETGRRDRARVPS 
QWCSVLQGKIiPVSGRTSIiACVRSII.LSPASSPRKVGIVGGTGAR 
AGAAPRDHGRVRHRRPS SARRMTRTTGQCIAPRGCQGPRGTRS P 
RSPRSRTRRGCSASPAC1»P /CRSAIiIVAVLCYINF iNYMDRFTV 
AGVLPDI EQFFN IGDSS SGLIOTVPI SSYMVLAPVFGYLGDRYN 
RKYLMCGG I AFWS LVTLGS S FI PGEH FWLI>I>LTRGLVGVG EAS Y 
STIAPTI>IADI*FVADQRSRMLSIFYFAIPVGSGI*GYIAGSKVKD 
MAGDWHWALRVTPGIiGWAVLLLFLWREPPRGAVERHSDLPPL, 
NPTSWWADIjRALARNPSFVIjSSLGFTAVAFVTGSLALWAPAFLL 
RSRWLGETPPCLPGDS CSSSDSLI FGLITCLTGVLGVGIjGVEI 
SRRJjRHSNPRADPLVCATGLLGSAPFLFLSLACARGS I VAT Y I F 

IFIGETLLSMNWAI vadi lly vvi ptrrstaeafqi vlshllgd 

AGS P YL I Gt>I S DRLRRNWPPSFLSEFRALQFSLMLCAFVGAIjGG 
AAFI*GTAHLH 


5415


693 


2986 


ippktklelqkh\lttlt\nqeqatifeevqklrprneqrenel 
iisflrcl.feekqkehihigemkqtsqmaaenigselppsatrf 
rldmlknkakrslteslesilsrgnkarglqbhsisvdi.dssls 
stlsntskepsvcekeaiipisessfkllgssedlssdseshlpe 
EPAPLS pqoafrrrantlshfpiecqeppqpargs PGVSQRKLM 

RYHSVSTETPHEK KDFES KANHLGDSGGTPVKTRRHSWRQQ I FL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
EKKRTSRELRELWQKAILQQI LI.LRME KEN QKLQ AS ENDLLN KR 
LKLD YEE I TPCLKE VTTVWEKMLSTPGRS KI KFDMEKMHSAVG Q 
GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKELLKQLT 
SQQHAILIDLGRTFPTHP YFSAQLGAGQliSLYN I LKAYSLLDQS 
VGYCQGLS F VAG I liLLHM SEE EAFKMLKFIjMFDMGIiRKQYRPDM 
IILQIQMYQLSRL1.HDYHRDLYNHLEEHEIGPSLYAAPWFLTMF 
ASQFPIiG F VARVFDM I FLQGrE VI FKVALSLLGSHKPI*I LQ HEN 
LETIVDFIICSTLPNLGLVQMEKTINQVFEMDIAKQLQAYEVEYH 
VX^EELIDSSPLSDNQRMDKLEKTNSSLRKQNLDLLEQLQVANG 
RIQSLEATIEKLLSS E SKLKQAMLTLELERSALLQTVEEIiRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF
RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHQLQKGYQGNGDYG
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC
RSVAVGAEENMNDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTE
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT
KLKQELQAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQMVGSH
MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT
NLNLKEVRSIGCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLIAEQQ
TLLAENYSEIAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE
QEVGTSEGKPISSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD
HQNKAGYTPIMLAALAAVEAEKDMRIVEELFGCGDVNAKASQAG
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD


5417 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF
RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHQLQNGYQGNGDYG
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC
RSVAVGAEENMNDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTE
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT
KLKQELQAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQMVGSH
MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT
NLNLKEVRSIGCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE
QSVGTSEGKPISSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD
HQNKAGYTPIMLAALAAVEAEKDMRIVEELFGCGDVNAKASQAG
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD


5418 


24 


1133 


S VPRAGGDME TGAAELYDQALLG I LQH VGNVQDFLRVLFG FL YR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQE VEKVQ PPG P VKEMAHGS QEAE APGAVAGAAE VPR\ E P ? I 
LPR I QEQ FQKNPDS YNGAVRENYTWS QDYTDLE VRVPVPKH WK 
GKQVS VALSSS S I R VAMLEENG ER VLM2GKLTHKINTESS LWS L 
EPGKCVLVNLSKVGEYWWNAILEGEEPIDIDKINKERSMATVDE 
EE<3AVLDRLTPDYHQKLQGKPQSHELKVHEMLKKGWDAEGSPFR 
GQRFDPAMFNISPGAVQF 


5419 


1395


259 


GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEESPFLGPRAAEEG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
I03PASLPQC/I^P/DCVRPAQPSSKYCSDDCGMia^AANRIYEIL 
PQRI QQWQQS P C I AEEHG KKLLER I RREQQSARTRLQEMERRFH 
ELEAI ILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHP INPR 
VALRHMERCYAKYESQTS FGSMYPTR I EGATRLFCDVYNPQS KT 
YCKRLQVLCPEHS RDPKVPADE VCGC PL VRDVFELTGDFCRLP K 
RQCNRHYCWEIO.RRAEVDLERVRVWYKLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHBRIR 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ECI I STLLFATIiYILCHIFLiTRFKKPAEFTTXGMMKMPPSTRL/ 
LLELCTFTLAIALGAVLLLPFS 1 1 SN3 VLLS LPRNY Y I QWLNGS 
LIHGI»WNLVFLFSNI>SLI FLMP FAYFFTES EG FAGSRKGVLGR V 
YETVVMLMLLTLjLiVIjGMVWVASAI VDKNKANRESIjYDFWEYYLjP 
YLYSCISFI^VLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QL YCS AFEEAALTRRI CN PTSCWIiPLDMELLHRQVLALQTQRVIj 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 
EAAMPRGMQGTSIiGQVS FS JCLGS FGAVIQ WL I F YLMVSS WGF 
YS S PLFRSLR PRWHDTAMTQI IGNCVCLLVLSSALPVFS RTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTIiCLVKTFTAAV 
RAEIiIRAFGERE 


5421 


117 


1733 


NEAGGACP FKGGASGRL Y LSPRLPR VS VAGCEER? LGW VW VLGG 
GGFLPARPPRAQRHIiGFSHAEQSMEAPDYEVI*SVREQIiFHERIR 
ECI I STLLFATLYILCHI FLTRFKKPAEFTT\ GMMKMPPSTRL/ 
LLELCTFTIiAIALGAVLLbPFS 1 1 SNEVLLSLPRNYYIQWLNGS 
LIHGLWNLVFL FSNLS I>I FLMPFAYFFTES EG FAG S RKG VLGR V 
YET VVMLMLLTtiLVLGMVWVAS AI VDKNKANRE SL YDFWE Y YLP 
YLYS C I S FXGVLI*LLVCTPLGLARM FSVTGKt»I»VKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVLALQTQRVL 
LEKRRKASAWQRNI^GYPLAWLCLLVLTGLSVLIVAIHILELLI d 
EAAM PRGMQGTS LGQVS FSKLGSFGAV IQWLI FYLMVSSWGF 
YSS PLFRSIiRPRWHDTAMTQ I IGNCVCLLVIiSSALPVFSRTLGL, 
TRFDLLGDFGR FNWLGN F Y I VFL» YNAAFAGLTTLCLVKT FTAAV 
RAEIiIRAFGERE 


5422 


3 


1263 


SCGESIjPTWLAGASRPGIGRKGGAWGGRGGSSPAQVLLSPGPVF 
KAGCNW WHIiSRDQAG VQRCDLGSSQ P P P LGFKR FS CLSLPSS WD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGWPPGTQVEQI.T.YAKKLYDSAF 
HPDTGEKMNVI GRMSFQLPGGMI ITGFMLQFYRTMPAVI FWQWV 
NQS FNAI*VNYTNRNAASPTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP PLVGRW VP FAAVAAANCVN I PMMRQQELZ KGI CVKDRN 
ENEIGHSRRAAAIGlTQWISRITMSAPGMIIiLPVlMERIiEKLH 
FMQKVKVL/ SAPLQVMIiSGCFLI FMVPVACGLFPQKCEIjPVS YL» 
BPKLQDTI KAKYGELEPYVYFNKGL 


5423 


3186 


905 


GVSMAIiGEEKAEAEASEDTKAQS YGRGS CRERELDI PGPMSGEQ 
PPRLKAEGGIiIS PVWGAEG I PAPTCWIGTDPGGPSRAHQPOASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 
EFPQTLSLPRTTICSGHDADTEDDPSIiADIjPOAIiDLSQQPHSSG 
LS CLS QW KS VLS PGS AAQPS SCS 1 S ASSTGSSLyQGHQERAEPRG 
GSLAKVSS SLEPWPQEPS S WGIjGPRPQ WSPQP VFSGGDASGI, 
GRRRLS FQAE YWACVLPDSLPPS PDRHS PLWNPNKE YEDIjIiDYT 
YPLRPGPQL PKKLDS RVFADP VXiQDSGVDLDSFS VS PASTLKS P 
TWSPNCPPAEATAJUPFSGPREPSLKQWPSRVPQKQGGMGLASW 
SQLASTPRAPGSRDARlTORREPAI>RGAKDRLriGKHI»DMGSPQIi 
RTRDRGMPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARXTQVSSLVSYLGSISTLVTLPTGDlKGQSPIiEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSUSSSQALGVSSGIiLKTRPSLPARiDRWPFSDPDVEGQLPRK 
GGEQGKESIiVQC \ VKTFC\CQLEELICWL YWV\ AD VTDHGTPAR 
SNLTSbK\SSLQLYRQFKKDI DEHQSLTESVLQKGE I LLQCLLE 
NTPVLEDVLGRIAKQSGELESHADRLYDS I LAS LDMLAGCTL I P 
DKKPMAAMEHPCEGV 


5424


3186 


905 


GVSMAIiGEEKAEAEASEDTKAQS YGRGSCRERBLiDIPGPMSGEQ 
P PRLEAEGGLi I S P VWGAEGI PAPTCWIGTDPGGPS RAHQPQAS D 
ANREPVAERSEPALSGLPPATMGSGDIiIiIiSGESQVEKTKLSSSE 
EFPQTIiSLPRTTICSGHDADTEDDPSIiADLPQAIjDLSQQPHSSG 
LSCIiSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSIAKVSSSLEPWPQEPSSWGIiGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAE YWACVLPDSLPPS PDRHS PLWNPNKEYEDLLDYT 
YPLRPGPQIiPKHLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP 
TNVSPNCPPAEATALPFSGPREPSl*KQWPSRVPQKQGGMGIiASW 
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL
RTRDRGWPSPRPBREKRTSQSARRPTCTESRWJCSEEEVESDDEY 
LALPARIiTQVSSLVS YLGS IS TI/VTLPTGDI KGQS PLEVSDSDG 
PASPPSSSSQSQLPPGAAIiQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR
SNLTSLKNSSLQLYRQFKKDIDEHQSLTESVLQKGEILLQCLLE
NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLIP
DKKPMAAMEHPCEGV


5425 


1086 


115 


GFCPSPSLGHQPPRVLHPT.MSMAVETFGFFMATVGLI.MLGVTLP
NS Y W R VST VHGNV I TTNT I FENI»W FS CATDS LGVYNCWE FPSMI* 
ALSGYIQACRALMITAlLLGFLGLIiLGIAGLRCTNIGGIiELSRK 
AKLAATAGAPHV ILPGICGMVAI \SWYAFNITR\DFSDPLYPGT 
KYELGPALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRP 
YQ AP VS VMP VATS DQ EGDS S FGKYGRNALR VAALCRGPRCli PTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYliPSEAGSIil F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQP5APSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA 
GAVGLSVGFVLFGIALYLGWRRVRDEKEIKSLRAARQLLDDEEQL 
TAKTL YMSHRELPAWVS FPDVE KAEWLNKI VAQVWPFLGQ YMEK 
LLAET VAPAVRG SNPHLQT FTFTRVELGE KPLR 1 1 GV K VHPGQR 
KEQ I LLDLNI S Y VGDVQ I DVEVKKYFCKAG VKGMQ LHGVLR VII* 
EPL IGDLPFVGAVSMFFI RRPTLD INWTGMTNLLDIPGLSSLSD 
TM I MDS I AAFTiVl»PNRI»L VPLVPDLQDVAOLRS PLPRGI I R I HL 
LAARGLSSKDKYVKGLIEGKSDPYA1.VRLGTQTFCSRV1DEELN 
PQWGET YEVMVHE VPGQE I EVEVFDKDPDKDDFLGRMKL.DVGKV 
IK3AS VLDDWFPLQGGQGQ VHIjR LEW hS LxJjSDAEKZjEQVZiQWNVJG 
VSS RPDP PSAAI L WYLDRAQDLPMVTS EL YP PQL KKGNKE P NP 
MVQLS IQDVTQES KAVYSTNCPVWE EAFRFFliQDPQSQELDVQV 
KDDSRALTIiGALTI>PI*ARLLTAPEL1LDQWFQI,SSSGPNSRIjYM 

KLVMRIHYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVPAPPR
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY
VKLKLAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQEIIEVEVF
dkdldio>dflgrckvrittvi,nsgfi»dewltledvpsgrlihlrt» 

ERLTPRPTAABLEEVLQ VNSIjIOTQKSAELAAALLS I YMERAED 
LPLRKGTKHLS P YATLTVGDS S HKTKTI SQTS AP VWDES AS FLI 
RKPKTE S LELQVRGEGTG VLGSLSL PLS ELLVADQLCLDRW FTL 
SSGQGQVLLRAQLGinvSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGP PH I TS S APE V \ RQRI*THVDSPLEAPAGPLGQVKI*TLWY YSE 
ERKLVS X VHGCRSLRQNGRDP PDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLS PE FNERFEWE LPLDEAQRRKLDVS VKSNS S FMSREREL 
LGKVQLDLAETDIiS QG VARW YDLMDN KDKGSS 


5427 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP

PAAHAKPDPGSGGQPAGPGAAGEAIAVLTSFGRRLLVLIPVYLA 

GAVGIiS VG F VLFGLALYXGWRRVRDEKERSLRAARQLLDDEEQLi 

TAKTJLYMSHREI>PAWVSFPDVEKAEWLNKXVAQVWPFLGQYMEX 

LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR 

KEQILLDLN I S YVGDVQID VE VKKY FCKAGVKGMQLHG VLRVI L 

EPI»IGDLPFVGAVSMFFIRRPTIJ5INWTGMTNLLDIPGLSSLSD 

TMIMDS I AAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IR1HL 

IAARGLSSKDKYVKGIjIEGKSDPYALVRLGTQTFCSRVIDEEIjN 

PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV 

IiQASVI»DDWFPL0«^QVHLRLEWLSI>LSDAEKI,EQVIX2WN^JG 

VSSRPDPPSAAILVVYIJ5RAQDI,PMVTSEI>YPP0I»KKGNKEPNP 

MVQLS IQDVTQES KAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 

KDDSRALTi^ALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 

KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 

PCHTTPDSQFGTEHVXiR IHVLEAQDL I AKDRPLGGLVKGKSDPY 

VKLKLAGRS FRSHWREDLNPRWNEVFEVIVTSVPGQELEVEVF 

DKDI^KDDFLGRCKVRI*TTVIiNSGFIjDEWLTLEDVPSGRIjHLRL 
E RLTPRPTAAELEE VLQ VNSL IQTQKSAELAAALLS I YME RAED 
LPLRKGTKHLS PYATLTVGDS SH KTKT I SOTS APVWDES AS FL I 
RKPHTESLELQVKGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGV^AHSHSYSHSSSSLSEEPELS 
GGP P HITSS AP E V \ RQRLTHVDS PLE AP AG PLGQ VKLTLW YYS E 
ERKLVS I VHG CRSLRQNGRDPPDP Y VSLLLLPDKNRGTKRRTS Q 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
IX3KVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 


3 


1839 


SSRSERLSACAIAPPWLVSSRPARPAQLQRPGKMVEDGABELED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
LAASI PYFHAMFTNDMMECKQDEIVMQGMDPSALEALINFAYNG 
NLAI DQQNVQS LLMGAS FZjQLQS I KDACCTFLRERLHP KNCLGV 
RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL
VSRDELNVKSEEQVFEAALAWVRYDREQRGTPLXRNLQSNIRLL
FCRPQFLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHIjP 
AFRTR PRCCTS I AGLI YAVGGLNSAGDSLNWEVFDP 1 ANCWER 
CRPMTTAR SR VG VAWNGLL YAI GG YDGQLRLSTVQAYNTETDT 
WTRVGSMNSKRSAMGTVVLDGQIYVCGGYDGNSSLSSVETYSPE 
TDKWTVVTSMSSNRSAAXGVTVFEGRIYVSGGHDGLQIFSSVEH
YNHHTATWHPAAGMLNKRCRHGAASLGSKMFVCGGYDGSGFLSI
AEMYSSV\ADQWCLIVPM\HTRR\SRVSLGGPAVGRLYAVWGVT
TGQSNL\SSVGDVLTPETDCWTFM\APMACHEGGVGVGCIPLLT
I


5429 


828 


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDLPPTISLSDGEEPPPYQGPCTLQ 
LRDPEQQLELNRESVRAPPNRTIFDSDLMDSARLGGPCPPSSNS
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL


5430 


441 


1507 


QKRRKRRRKKIMKTIQPKMHNSISWAIFTGLAALCLFQGVPVRS
GDATPPKAMDNVTVRQGESATLRCTIDNRVTRVAWLNRSTILYA 
GNDKWCLDPRVVLLSNTQ TQ YS I BIQNVDVYDEG P YTCS VQTDIT 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRH I S PKAVGFVSEDE YLE I QG ITREQSGDYE CS ASNDV\A 
APV\VRRVKVTVNYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
EFQW Y KDD KRL I / EGKKG VKVENRP FLS KLI FFNVS EHDYGN YT 
CVASNKLGHTNAS I MLFGPGAVS EVSNGT S RRAG C VWLLPLLVL 
HLLLKF 


5431 


2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEEbELDEQQ 
KKRIjEAFIiTQKAKVGELKDDDFERISBLGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIREIiQVLHECNSPYIVGFYGAFY 
SDGEI S I CMEHKDGGSLDQVLKEAKRI PEE I LGKVS I AVLRG LA 
YLREKHQ IMHRDVKPSNILVNSRGEIKLCDFG VSGQL I DSMANS 
F VGTRS YMAPERLQGTHYSVQS D I WSKGLSLVELAVGR YPIPPP 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMAI FELLD YI VNE P P P KLPNGV FTPDFQEFVNKCL I KNPAERA 
PLKMLTHHTFIKRSEVEEVDFAGWLCKTLRIiNQPGTPTRTAV 




2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPVV
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLBELELDEQO 
K KRL EA FLTQKAX VGE L KDDD FER I S ELGAGNGGWTKVQHRP S 
1 NAKKJj I HLEI KPAIRNQ I IRELQVLHECNS P Y I VGFYGAFY 
SDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 
YLREKHQ IMHRDVKPSNILVNSRG E I KLCDFG VSGQL I DSMANS 
FVGTRSYMAPERliQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMAI FELLDYIVNEP PPKLPNGVFTPDFQEFVNKCLI KNPAERA 
DLKMLTNHTFI XRSEVEEVDFAG WLCKTLRLNQPGTPTRTAV 


5433 


360 


1885 


SVQEDKVGFEDPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
LFGWPSLVFVFKNEDYFKDLOGPDAGPIGNATGQADCKAQDERF 
SLI FTLGSFMNNFMTFPTGYI FDRFKTTVARL I A I FF YTTATLI 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








I AFTSAGSAVLLPIiAMPMLTIGG I bFLITNLQIGNL FGQHRSTI 
ITLYNGAFDS SSAVFLI I KLLYEKGI SLR/ VLLHLHLCLQ YLAC 
STHFPPDAPGAHPI PTAPQLQLWP VPWEWHHKGR EG /QQLSMKT 

v '*-' i'JU^or W^r^p*.Kt'\jL»v«jKoKWi>>Vlr'i>ljAIi>/ Q-SRKFAWHLVWL 

S V I QLWH Y LFlGTIjNSLLTNMAGGDMARVSTYTNAFAFTQFG VL 
CAPWNGLLMDRLKQKYQKEARKTGS 5 TLAVALCS TVPSLALTSL 
LCLGFALCASVPILPLQYLTFILQVISRSFLYGSNAAFLTLAFP 
SEHFGKLFGLVMALSAWSLLQFP I FTLI KGSLQNDP F YVNVMF 
MLAILLTFFHPFLVYRECRTWKESPSAIA 


5434 


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSGK
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV
LRQGRFGMFIGCINYPECEHTELIDKPDETAITCPQCRTGHLVQ
RRSRYGKTFHSCDRYPECQFAINFKPIAGBCPECHYPLLIEKKT 
AQGVKHFCASKQCGKPVSAE 


5435 


4704 


1597 


I PGDSSQRIiAEMSNAKERKHAKKMRNQPTNVTLSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQEI PKY I TASTFAQARAAE I SAMLKAV 
TQKSSNSLVFQTLPRHMRRRAMSHNVKRLPRRLQEIAQKEAEKA 
VHQKKEHSKNKCHKARRCHMNRTLEFNRRQKKNIWLETHIW3IAK 
RFHMVKKWGYCLGERPTVKSHRACYRAMTNRCLLQDLSYYCCLB 
LKGKEEEI LKALSGMCN IDTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPREMLGPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
I KAACQCVEPI KSAVCIADPLPTPSQEXSQTELPDEKIGKKRKR 
KDDGENAKP I KfCI IGDGTRDPCLP YS W IS PTTG III SDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DSVS LHCRQEA I FELLLGG I TSPAEI PAGTILGLTVGDPRINLPQ 
KKS KALPNP EKCQDNEKVRQLLLEG VP VECTHS F I WMQD I CKS V 
TBNKISDQDLNRMRSELLVPGSQLI LGPHESKI PILLIQQPGKV 
TG EDRLGWGS G WD VLLPKG WGMAFW I PFIYRGVRVGGLKESAVH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDL 
RRS EVPCAPM PKKTHQPS DEVGTS I EHPREAEEVMEfcAGCQESAG 
PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLS ILGHFPRALVWVSLSLLSKGS PE 
PHTMI CVPAKEDFLQLHEDWH YCGPQES KHSDPFRSKI LKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAGQEALTLGLWSGPLPRVTL 
"^>k x ujjujt v i V«i^l*^MAVGCXjEAI^FVSLTGLLDMLSSQPAAQ 
RGLVLLRPPASLQYRFARIAIEV 


5436 


1781 


635 


ASDS I PWSEARTTRKLAQRGCQWSLPERMPLWFCGLP YSGKSR 
RAEELR VALAAEGRAVYVVDDAAVLGAEDPAVYGDSAREKALRG 
ALRASVERRLSRHDWILDSL>JYIKGFRYELY\CLARAARTPLC 
LVYCVRPGGP I AGPQ VAGANENPGRNVS VS WRPRAE EDGRAQAA 
GSS VLRELHTADS WNGSAQADVPKEIiEREESGAAES PALVTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRIsrRWDRPLFTLVGL 

QVLAGLMEAQKSAVPGDIiLTL PGTTEHLR FTR PLTMAELSRLRR 
QFISYTKMHPNNBNLPQLANMFLQYLSQSLH 


5437 
5438


739 


1672 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WDHVDSGGTRRPGVS PEGGL \GVPGPGAPLEKPGRREKLLGWLR 
GEPGAPSRYIiGGPEECXQIST^TIJ^LELIJ^AIjIiAJ^CSRPLR 
AALDTLGLRGPLGLWLHGLLSFLAALHGLHAVLSLLTAHPLHFA 
CLFG5LLQALVLAVSLREPNGDEAATDWESEGLEREGEEQRGDPG 
KGL 




2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
IiAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC 
MSRPAIrfCLRSWPLTVLYYXLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLIiSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
DLHKYRNLSEFFRRKLKPQARP VCGLHS VISPSDGRI LNFGQVK 
NCEVEQVKGVTYSLES FLGPRMCTEDLPFP PAASGDS FKNQLVT 
REGNBLYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSIiMSVNP 
GMAR WIKELFCHNER WLTGDWKHG FFS LTAVGAT\NWGS IRI Y 
FDRDLHTKSPRHSKGSYNI)FSFVTHTNREGVPMALRGEHLG/OS 
FNiiGoTI V Itl FEAPKDr NF QIjKTGQKIRFGEAIjGSLi 


5439 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLAIiRRRLGQLSC 
MSRPALKLRSWPLTVLYYLLPPGALRPLSRVGWRPVSRVAIjYKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT 
REGNEIiYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGS3bMSVlvTP 
GMARW I KEIiFCHNERVVXiTGDWKHGF FS LTAVGAT \NWGS I R I Y 
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS 
FNLGSTIVLI FEAPKDFNFQLKTGQKIRFGEALGSL 


5440 


693 


253 


EP I PVTPDHRLVTMTHIV \QTFSPVNS \GQPPNYEMI*KEEQEVA 
MLGAPHNPAPPMSTVIHIRSETSVPDHVVWSLFKTLFMNTCCLG 
FIAFAYS VKSRDRKMVGDVTGAQAYASTAKCLNI WAIiI LGI FMT 
ILLI 1 1 PVLWQAQR 


5441 


2 


2054 


CRDGGKNGFMVS PMKPLE I KTQCSGPRMDPKICPADPAFFS FIN 
NSDLVTVANIETGEERRLTFCHQGLiSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTIjRILYEEVDESEVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIALKIiAEFQTDSQGKIVSTQEKE 
LVQPFSSIiFPKVEYIARAGWTRDGKYAVIAMFIjDRPQQWLQLVIilj 
PPAIiFIPSTENEEQ\RLASARAVPRI*VQPYVVYEEVTNVWINVH 
DIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSQGYDW 
S EPFS PGEGEQS I*TNAIWVN EETKL»VY FQGTKDTPLEHHL YWS 
YE AAG E I VRLTTPG FS HS CSMSQNFDMFVSH YSS VSTPPCVHVY 
KLSGPDDD PLHKQPRFWASMMEAAKI FHFHTRSDVRIiYGM I Y KP 
HALQPGKKHPT VLFVYGGPQVQLVNNS FKG I KYLRLNTIASLGY 
AVWI DGRGSCQRGLR FEGALKNQMGQ VE I EDQ VEGLQ FVAE KY 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTGYTERYMDVPENNQHGYEAGSVAIjHVEKLiPNEPNRLLIIiH 
GFLDENVHFFHTNFLVSOIj I RAGK P YQLQVALPPVS PQI YPNER 
HS IRCPESGEHYEVTLLHFLQEYL 


5442 


1 


3474 


CGQRSRRRS PDMPEAKPAAKKAPKG KDAPKGAPKEAPPKEAPAE 
APKEAPPEDQSPTAEEPTGVFLKKPDSV3VETGKDAVWAKVNG 
KELiPDKPT I KWFKGKWLEI/3SKSGARFSFKESHNSASNVYTVEI, 
HIGKWLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 
E S FKRTS EKKSDTAGEliDFSGIiIiKKREWEEE KKKKKKDDDDLG 
I P PE I WELLKGAKKS E Y EKIAFQYG I TDLRGMI>KRLKKAKVE VK 
KS AAF^*KK1*DPAYQVDRGNKIKI*MVS I SDPDLTIjKWFKNGQEIK 
PSSKYVFEN VGKKRILT INKCTLADDAAYEVAVKDEKCFTELFV 
KEPP VL I VTPLEDQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEIi 
TREDSF KAR YRFKKDGKRHILI FSDWQBDRGRYQVITNGGQCE 
AELIVEEKQIjEVIjQDIADLTVKASEQAVFKCEVSDEKVTGKWYK 
NGVEVRPSKRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 
GSLS AKUtf FLEIKVEYVPKQ\EPP KI PLGFASGGKTSENAD/ IV 
WAGNKLiRLiDV\S ITG EAPSPFAT\ WLKG \DEVFTTTEGRTRIE 
KR VDCSS F VI ESAQREDEGRYTI KVTNPIGBDVAS I FLQ WDVP 
DP PEAVRI TS VGEDW AI LVWEP PMYDGGKP VTG YI>VERKKKGSQ 

TKPFMPIAPTS3PLHLIVEDVTDTTTTLPCWRPPNRIGAGGIDGY 
LVEYCLEGS EE W VPANTE PVERCGFTVKMIiPTGARI LFRWGVN 
IAGRSEPATLAQPVTIRE I AEPPKI Ri»PRHI*RQTYIRKVGEQI»N 
LWPFQGKPRPQVVl^KGGAPIJDTSRVHVRTSDFDTVFFVRQAA 
RSDSGE YEI»S VQI ENMKDTATIR I RWEKAGP P IWVMVKE VWGT 
NAL VE WQAPKDDGNSE I MG Y FVQKADKKTMEW FNV YERNRHTS C 
T VSDL I VGNE YY FRVYTENI CGLS DS PGVS KNTAR I LKTG I TFK 
PPEYKEHDFRMAPKFLTPLIDRWVAGYSAAIiNCAVRGHPKPKV 
VWMKNKME I REDPKFL I TNYQG VLTLWIRRPS PFDAG TYTCRAV 
NELGEALAECKLEVRVPQ 


5443 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSPRRSRSAAEPA 
MALS M PLNGLK EEDKE PLI ELF VKAGSDGES IGNCP FSQRLFMI 
LWLKGWFSVTTVDLKRKPADLQNLAPGTHPPFITFNSEVKTDV 
NKI EEFLEEVLCPP KYLKLS PKHPESNTAGMDI FAKFS AYIKNS 
RP EANEALERG LLKTLQKLDE YLNS PLPDE I DENSMED I KFSTR 
KFLDGNEMTLADCNLLPKIjHIVKWAKKYRNFDIPKEMTGIWRy 
LTNAYSRDEFTNTCPSDKEVEI\AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPTPDYTESDIDRAY 
RAQKNLJDFEDP Y*DSESRLEPDPAGPGDS KNPGDAKYG SPKHRIi 
I KVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAP PDDG 
YME P YDAQWVMS ELPGRGVQLYDTP YEEQDPETADGPPSGQKPR 
QSRMPQEDER PADE YDQP WEWKKDHI SRAFAVQFDS PE WKRTPG 
SAKELRRPPPRSPQPABRVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFliHLKFARTRENQW 
LGQHSGPFPS VPELVLKYSSRPL PVQGAEHLALLYP WTQTP* Q 
* PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGIjHRERHPEGLP 
RAEKPGLRG PLLGLR E PLGAG PRG P WGLQE PRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


1LSRGFLGS VE I CIQLPLPASEP VLLIjTWARRRWRETRSRREPT 
TLRAQSVCPWWI *ETRMNRSIPVEVDESEPYPSQLLKPI PEYSP 
EEESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKIANH 
QRPVSRQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHOFSFMEKRNOW7.VQnT.QAa Q DnrrHnon^on 

QSLPNASADSLGGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLG IRQLER PLPLTS VCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLP PNTjSPHAPWNYH YHCPGS PDHQVPYGHD YPRAAYQQVIQP 
ALPGQPLPGASVRGLHPVQKVILIfYPSPWDQEERPAQRDCSFPG 
LPRHQDQPHHQP PNRAGAPGESLECPABLRPQVPQPPS PAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAME\A/KFVNFLLV 
NGFQTA ID I FEDR IRG I DI I KWMER YLRD KTVM I I VAIS PK YKQ 
DVEGAESQLDEDEHGLHTKYIHRMM0IEF1 KQGSMNFRFI P VLF 
PNAKKEHVPTWLQNTHVYS WPKNKKN I LLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRLWGLLWMLFVSELRAATKLTEEKYELKEGQ
TLDVKCDYTLEKFASSQKAWQIIRDGEMPKTLACTERPSKNSHP
VQVGRIILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQPPKEPH
MLFDRIRLWTKGFSGTPGSNENSTQNVYKIPPTTTKALCPLYT
TPRTVTQAPPKSTADVSTPDSEINLTNVTDIIRVPVFNIVILLA
GGFLSKSLVFSVLFAVTLRSFVP*AHEPTRMSSDFQPHPSGSCA
KGGGRR


5447 


207 


617


MTARTLS LMAS L VA YDDS DS EAETEHAGSFNATGQQKDT S GVAR 
PPGODFASGTLDVPKAGAQPTKHGSCEDPGGYRLPLAQLGRSDR 
GSCPSQRI^WPGKEPQVI^PIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSS FQKKKCEDCWPY 
TPRRLRQRQALS TEItSKGKD VEPQGP PAGRAPAPL YVG PGVS EF 
IQPYLWSHYKETTVPRKVLFHLRGHRGPVNTIQWCPVLSKSHML 
LSTSMDKTF KVWNAVDSGHCLQTYS LHTEAVRAARWAP CGRRI L 
SGG FD FALHLTDLETGTQL FSGRSD FRITTLKFHPXDHN I FLCG 
GFSSEMKAWDIRTGKVWRSYKATIQQTLDILFLREGSEFLSSTD 
ASTRDSADRT1 1 AWDFRTSAKISNQI FHERFTCPSIiALHPREPV 
FIAQTNGNYLALFSTVWPYRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTG S ADGRVLM YSFRTAS RACTLQGHTQAC VGTTYH PVLP 
S VLATCS WGGDMKI WH *AFHWLSLGEAIGDLAPARG YSGPGRS L 
KSPSPS KSLLVLLCGRAMFQ PATCP WQL PALS K 


5448 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKXTA
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF 
RWWLQVTS KVI FFWLL VLYLLQ VAA I VL FCSTS S PHS I PLTEVI 
GPIWMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGS STTDNTQEGAVQNHGTSTS HS VGTVFRDLWHAAFFLS 
GSKKAKNS IDKS TETDNG YVS LDGKKT VKSGEDG IQNHEPQCET 
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLE I SGMI MNR VNSH I PGIG YQI FGUAVSL I LGLTPF VFRLSQA 
TDLEQLTAHSASELYVI AFGSNEDVI VLSMVI I SFWRVSLVWI 
FFFLLCVAERT YKQVGI M * TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRC5SSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQINPC^KKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI 
LGLTP FVFRIiS QATDLEQLTAHSAS EL Y V I AFGSNEDVI VLSMV 
1 1 S FWRVS LVW I FFFLLCVAER TYKQVG IM 


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA
H VKPDL I DVDLVRGS AFAKAXPES PWTS LTTKG I VRWF F PF FF 
RWWLQVTS KVI FFWLLVL YL^QVAAI VLFCSTSS PHSIPLTEVI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTD>ITOBGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GS KKAKNS IDKSTETDNG YVSLDGKKT VKSGEDG I QNHEPQCST 
I RPEETA WNTG TLRNGPS KDTQRT I TNVSDEVS S EEGPETG YS L 
RRHVDRTS EGVLRNRKSHH Y K KHY PN EDAPKSGTS CSSRCS S S R 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLEISGMIMNRVNSHI PGIG YQI FGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSAS ELYVI AFGSNEDVI VLSMVI I S FWRVSLVWI 
FPFLLCVAERTYKQVGIM * TSEGVLRNRKSHHYKKHYPNEDA? K 
SGTSCSSRCSSSRQDSESARPESBTEDVLWEDLLHCAECHSSCT 
S ETD VENHQ IN PCVKKE YRDDP FHQSHLPWLHS S HPGLEKI SAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIB-GNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNSDVIVLSMV 
I ISFWR VSLVWIFFFLLCVAERTYKQVGIM 


5450 


8136 


1242 


GQQFAS FFG* NHPEVTVAMALTDIDLOJLQFSMSQPEALLLLAAG 
PADHLLLQLYSGHLQVRLVLGQEELRLQTPAETLLSDS I PHTW 
LTVVEG WATLS VDG FLNAS SAVPGAPLE VP YGLF VGGTGTLGL P 
YLRGTSR PLRG CLHAATLNGR SLLRPLTPDVHEGCAEEFSASDD 
VALGFSG PHSLAAFPAWGTQDEG TLE FTLTTQSRQAPLAFQAGG 
RRGDP IYVDI FEGHLRA WE KGQGTVLLHNS VP VADGQPHE VS V 
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRIiGLTPE ATNAS LLG CMEDLS VNGQRRGLRE ALLTRNMA 
AGCRLEEEE YEDDAYGHYEAFSTIiAPBAWPAMELPE PC VPE PGL 
PP VFANFTQLLT IS PLVVAEGGTAWLE WRHVQPTLDLMEAELR X 
SQVLFSVTRGAHYGELEIiDILGAQARKMFTLLDVVNRKARFIHD 
GSEDTSDQLVL.E VSVTARVPM P S CLRRGQTYLLP I QVNPVNDP ? 
HI I FFHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFOVLGT 
SSGLP VERRDQPGEPATEFS CRELEAGSLVYVHCGG PAQDLTFR 
VSDGLQAS PPATLKWAI RPAIQIHRSTGLRLAQGSAMP ILPAN 
LSVETNAVGQDVSVLFRVTGALQFGELQKHSTGGVEGAEWWATQ 
AFHQRDVEQGRVRYLSTDPQHHAYDTVBNIALEVQVGQEILSNL 
SFPVTIQRATVWMLRLEPI.HTQNTQQETLTTAHLEATLEEAGPS 
PPTFHYEWQAPRKGNLQLQGTRLSDGQGFTQDDIQAGRVTYGA 
TARAS EAVEDTFRFRVTAPP Y FS PLYTFP IHIGGDPDAPVLTNV 
LLWPEGGEG VLSADHL FVKSLNSAS YLYE VMER PRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATRQGE 
SSGDMAWEEVRGVFRVAI QPVNDHAPVQT I SRI FHVARGGRRLL 
TTDDVAFS DADSGFADAQLVLTRKDLLFGS I VAVDEPTRPI YRF 
TOEDLRKRRVIiFVHSGADRGWIQLQVSDGQHQATALLEVQASEP 
YLRVANGS SLVVPQGGQGTIDTAVLHLDTNLDI RSGDEVHYHVT 
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLS PEDTMAF 
S VEAGPVHTDATLQ VTI ALEGPLAPLKLVRHKKI YVFQGEAAEI 
RRDQLEAAQ EAVPPADI VFS VKS PPS AGYLVM VSRGALADEP PS 
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSbDVASGLGAPLE 
GVLVELEVLPAAI PLEAQNFS VP EGGSLTLAP PLLRVSGP YFPT 
LLGLS LQVLEP PQHGPLQKEDGPQARTLS AFS WRMVEEQLI RYV 
HDGSETLTDS PVLMANAS EMDRQSHPVAFT VTVLP VNDQ P P I LT 
TNTGLQMWBGATAP I PAEALRSTDGDSGSEDLV YTIEQPSNGRV 
VLRGAPGTEVRS FTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLLI.YRVVRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GN I LYEHEMP PEPFWEAHDTLELQLSS PPARD VAATLAVAVS FE 
AACPQRPSHLWKNKGLWVPEGQRARXTVAALDASNLLAS VPS PQ 
RSEHDVLFQVTQFPSRGQLLVSEEPLHAGQPHFLQSQLAAGQLV 
YAHGGGGTQQDG FH FRAHLQGPAGAS VAG PQTSEAFAIT VRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVQ 
RAPHNG FLS LVGGGLG P V^R FTOAnvn«;r;p T.a P*vziNir« c ctt-ii r* t t? 

QLSMSDGASPPLPMSItAVDILPSAIEVQLRAPLEVPQALGRSSL 
SQQQLRWSDREEPEAAYRMQGPQYGKLLVGGRPTSAFSQFQI 
DQGEWFAFTNFSSSHDH FRVHjAIjARGVNASAVVNVTVRALLHV 
WAGGPWPQGATLRLDPTVLDAGELANRTGSVPRFRLLEGPRHGR 
WRVP RARTE PGGSQIjVEQFTQQDLEDGRIiGLE VGRPEGRAPGP 
AGDS LTLELWAQGVP PAVAS LDFATEPYNAARPYS VALLS VP EA 
ARTEAG KPES S TPTGEPG PMASS PEPAVAKGGFLS FLEANMFS V 
IIPMCLVLLLIiAIiILPIiLFYLRKRNirrGKHDVQVliTAKPRNGLA 
GDTETFRKVEPGQAIPLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451 


1 


2274 


RDS S EQGRTGDTLGRPSACMDALKPPCLWRNHERGKKDRDSCGR 
KNS EPGS PHSLEALRDAAPSQGl^NFLLLPTKMLFI FNFLFSPLP 
TPALICILTFGAAI FI>WLI TR PQP VLPLnLDLKNQS VGI EGGARK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCU3YRKPNQ 
PYRWLS YKQVSDRAE YLGSCLLHKGYKS SPDQFVG I FAQNRPSW 
I ISELAC YTYSMVAVPLYDTLGPEAI VH I VNKAD IAMVI CDTPQ 
KALVLIGNVEKGFTPSLKVIILMDPFDDDLKQRGEKSGIEILSL 

I VSNAAAFLKC VEHAYE PTPDDVAIS YIiPLAHM FER I VQAWYS 
CGARVG FFQGDIRLLADDMKTLiKPTLFPAVPRLLNRIYDKVQNE 
AKTPLKJCFLIiKIiAVSSKFKEIiQKGIIRIiDSFWDKLIFAKlQDSIj 
GGRVRVI\nX2AAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPI^CNYVKIjEDVADMrnfFTVNNEGEVCI KG 
UTA/FKGYLKDPEKTQEALDSDGWI*HTGDIGRWLPNGTL5CIIDRK 
KN I FKLAQGE Y I APEKI EN I YNRSQP VLQ I FVHGES LRSS I»VGV 
WPDTD VLPS FAAKLG VKGS FEEI.CQNQWREAI LEDLQKIGKE 
SGLKTFEQVKAIFLHPEPFSIEKGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452


1833 


1138


SRVPSLCLSI*SI,SIjSPSREPVAGAPGCGTAGPPAiyiATLWGGI>LR 
LGSLLSIiSCLALSVLIjIAQLSDAAKNFEDVRCKCICPPYKENSG 
HI YNKN I SQKDCDChHWE PMPVRGPDVEAYCIiRCECKYEERSS 
VTIKVTI 1 1 YXjS I1GLLLLYMV YLTLVEPILKRRIrFGHAQLI QS 
DDDIGDHQPFANAHDVLARSRSRANVLWKVEYAQQRWKIjQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV
AGPAPSTVPSSTSKDR P VSQPS LVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETT VEVAWCELQDR KLTKS ERQRFKEEAEMLKGLQHPN I 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVI, 
RSWCRQII»JCGLQFI*HTRTPP I IHRDLKCnwiFrTGPTGS VKIGD 
LGLATLKRAS FAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
liEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAI PEVKEI I 
EGCIRQNKDERYSIKDIlL^7HAFFQEETGVRVEIIABEDDGEKIAI 
KiWLRIEDIKKLKGKYKDNEAIEFSFDIiERNVPEDVAQEMVESG 
YVCEGDHKTMAKAI KDRVSLI KRKREQRQL* 


5454 


111 


1520 


PS I PAAVPQSAPPE PHREETVTATATSQVAQQP PAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDI EIGRGS FKTVY 
KGLDTETTVEVAWCBliQDRKLTKS ERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVL4VTEI*MTSGTLKTYLKRFKVMKIKVIi 
RSWCRQILKGLQFLHTRTPPI 1HRDLKCDNI FITGPTGSVK1GD 
I/3LATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
I>EMATSE YP YS ECQNAAQ I YRRVTSGVKPAS FDKVAI PEVKEI I 
EGCIRQNKDER YS I KDLLNHAFFQEETGVRVELAEBDDGEKIAI 
KLWLR IEDI KKLKGK YKDNEAI EFS FDLERNVPEDVAOEMVESG 
YVCEGDHKTMAKAIKDR VS hXKRKRBQRQh * 


5455 


1359 


377 


LTMVSPATRKSLPKVKAMDFITSTAlLPLLFGCLiGVFGl.FRLLQ 
WVRGKAYLRNAWVITGATSGUSKECAKVFYAAGAKLVIiCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEII, 
- Q CFGYVD I LVNNAG I S YRGTIMDTTVDVDKRVMETNYFGPVAIjT 
XAI*LPSM I KRRQGHI VAISSIQGKMS IPFRSA YAASKHATQAFF 
DCLRAEMEQYE I EVTVI S PGYIHTNLSVNAI TADGSRYG VMDTT 
TAQGRS P VEVAQDVLAAVGKKKKDVT LADLLPSLAVYLRTIAPG 
LFFSLMASRARKERKSKNS 


5456 


2 


2332 


CGAG L VAAG AVL VL Y PASRAGERTRV ?3S PAPSS LPLH5 PGACG 
TEVDMDPQRSPLLBVKGNIELKRPLIKAPSQLPLSGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSIjTTVPQTQGQTTA 
OKVSKKTGPRCSTAIATGLKNQKPVPAVPVQKSGTSGVPPMAGG 
KKPS KRPAWD L.KGQL>CDI»UAELKRCRERTQTLDQ enqqlqdqijr 
DAQQQ\0CAI^TERTTI i EGHl^AKVQAQAEOGQQELKNLRACVI J EL 
EERLSTQEGLVQELQKKQVELQEERRGUVISQLEEKERRLQTSEA 

QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSALDGYPVCI FAYGQrGSGKTFTMEGGPGGDPOI^EGli I PR 
ALRHL»FSVAQEI»SGQGWTYS FVASYVE I YNBTVRDLLATGTRKG 
QGGECEIRRAGPGSEEi»TVTNARYVPVSCEKEVDALLHLARQNR 
AVARTAQNERSS RSHS VFQLQI SGEHS SRGLQ CGAPLS LVDLAG 
SERLDPGLALGPGERERLRETQAINSSLSTLGLVIMALSNKESH 
VP YRNSKLTYliLQNSLGGSAKMIrMF VN I S PLEENV3 ESIiNS LRF 
ASKVEPSVLFGTAQSr^KVJKTDPDI,CVCVCrVCVCVCVCVCVCVP 
MSM YRVRGGR VAGGCFIG WRAPCPRAI X 


5457 


2 


1540 


DDFVERRRWTRTTCL»VRSPPHVPVCGHACSWNGGSLDPLKGTPA 
LLRSAERL^KVKKIjRLDKENTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRSIIHGSRKYSGLIVNK 
APHDFQF VQKTDESGPHS HRL Y YLGM P YG SRENSLIiYS E I PKKV 
RKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGI 
TSYDFHSESGLFIiFQASNSliFHCRDGGKNGFMVSPGPGCVSPMK 
PLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVIjDDPKSAG VATFV1 QEE FDRFTG Y WWCPTAS W 
EGSEGLKTLRI LYEEVDES EVEVIHVPSPALEERKTDS YRYPRT 
GSKKPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEEQA 
ASLCQS CPQECPAVCG VRGGHQRU3QCS 


5458 


6642 


4022 


FVPGLRE PQWE PAQPS ATMSAPS EEEE YARLVMEAQP EWLRAEV 
KRLSHELAETTREKIQAAEYGLAVLEEKHQLKLQFEELEVDYEA 
IRSEMEQLKEAFGQAHTNHKKVAADGE SREES L I QESASKEQ Y Y 
VRKVLELQTELKQLRNVLTNTQS ENERLAS VAQELKE I KQNVE I 
QRGRLRDD I KEY KFRE ARLLQD YSELEEENI SLQKQ VSVLRQNQ 
VE FEGLKHE I KRIiEEETEYLNS QLEDAI RLKEIS ERQLEEALET 
LKTEREQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKFSDDAA 
EPNNDAEAIjVNGFEHGGIAKLP LDNKTSTP KKEGLAP PS PSI»VS 
DLLSELNISElQKLKQOI^IQMEREKAGLLATIiQDTQKQLEHTRG 
SLSEQQEKOTRLTENLSAIjRRLQASKERQTAIjDNEKDRDSHEDG 
DYYEVDIWGPEILACKYHVAVAEAGELREQLKAIiRSTHEAREAQ 
HAEEKGRYEAEGQAI.TEKVSLLEKASRQDRELIJ%RLEKEIiKKVS 
DVAGETQGSLSVAQDELVTFSEEliANXYHHVCMCNNETPNRVMli 
D YYREGQGGAGRTS PGGRTS PEARGRRSP r LLPKGLIiAPEAGRA 
DGGTGDSSPS PGSSliPSPLSDPRREPKHI YNLIAI IRDQIKHM 
AAVDRTTELSRQRIASQELGPAVDKDKEALMEEILKLKSIJLSTK








REQ I TTLRT VI.KAN KQTAE VALAN LKS K Y ENE KAM VTETMM KLR 
NELKALKEDAATFSSLRAMFATRCDEYITQLDEMQRQLAAAEDE 
KKTLNSLLRMAIQQKLALTQRLELLELDHEQTRRGRAKAAPKTK 
PATPSVSHTCA CAS DRA BGTGIiAUQVFCS EKHS IYCD 


5459 


316 


1262 


RGGHRLSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS 
KGPKRLEKFSDERAAYFRCYHKVTELNNVKNVARLPKSTKKHAI 
GIYFNDDTSKTFACESDLEADEWCKVLQMECVGTRINDISLGEP 
DLLATGVEREQSERFNVYLMPSPNLGCYMGECALQITYEYICLW 
DVQNPRVKLISWPLSALRRYGRDTTWFTFEAGRMCETGEGLFJF 
QTRDGEAIYQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 
SLSTMVPLPRSAYWQHITRQHSTGQLYRLQDVSSPLKLHRTETF 
PAYRSEH 


5460 


45 


2037


RPGCRAGELSTGSRARERVRNRVSAPCGQDSRRCDPEVLRGRSP 
GLGLAEM P S CG ACTCGAAAVRL I TS S LAS AQRG I SGGR I HMS VI, 
GRLGTFETQ1LQRAPLRSFTETPAYFASKDGISKDGSGDGNKKS 
AS EGSS KKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CEKCHH FF WLSEADS KKS 1 1 KEPESAAEAVKLAFQQKPPPPPK 
KI YNYLDKYWGOSFAKKVLS VAV YNH YKR I YNNIPANLRQQAE 
VEKQTSLTPRELEIRRREDEYRFTKLLQIAGISPHGNALGASMQ 
QQ VNQQI PQEKRGGBVhDSSHDD I KLEKSN I LLLGPTGSGKTLL 
AQ TLAKCLD VP FAI CDCTTLTQAG YVGED I ES V I AKLLQDANYN 
VEKAQQGIVFLDEVDKIGSVPGIHQLRDVGGEGVQQGLLKLLEG 
TIVNVPEKNSRKLRGETVQVDTTNILFVASGAFNGLDRI ISRRK 
NEKYLGFGTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRL 
LRHVEARDLI EFGMI PEFVGRLP WVPLHSLDEKTLVQILTEPR 
NAVI PQYQALFSMDKCELl^VTEDALKAIARLALERKTGARGLRS 
IMEKLLLEPMFEVPNSDIVCVEVDKEWEGKKEPGYIRAPTKBS 
SEEEYDSGVEEEGWPRQADAANS 


5461 


1481 


160 


INPPPP PKS PCGRARKWRRRRR PGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLG I HVFLVS CALPD 
SVLRRFWRTHCAVLGLVARQEDSGLRDHSVRVLISNHVTPFDH 
NIVNLLTTCSTPLLNSPPSFVCWSRGFMEMNGRGEtjVESLKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
QVQRPLVS VTVSDASWVS ELLWSL F VP FTVYQVRWLRP VHRQLG 
EANEE FALR VQQLVAKELGQTGTRLTPADKAEHMKRQRH PRLRP 
QSAQSS FPPSPQPS PDVQLATLAQR VKE VL PH V PLG VIQRDLAfC 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASAS KFPS SGPV 
TPQPTALTFAKSSWARQESLQERKQALYEYARRRFTERRAQEAD 


5462 


663 


3353 


KIKERQMSANNS PPSAQKS VLPTAI PAVLPAAS PCSSPKTGLSA 
RLSNGSFSAPSLTNSRGSVHTVSFLLQIGLTRESVTIEAQELSL 
SAVKDLVCSTVYQKFPECGFFGMYDKIIiLFRHDIWSENlLOLIT 
S ADE I HEGDIjVEWLS ALAT VEDFQ I RPHTLYVHS YKAPTFCD Y 
CGEKLWGLVRCK3LKCEGCGLNYHKRCAFKIPNNCSGVRKRRLSN 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKMVMCRVKVPHTFAVHSYTRPTICQYCKRLLKGLFRC3GMQC 
KDCKFNCH KRCAS KVPRDCLGEVTFNGE PSSLGTDTD I PMD IDN 
NDINS DSSRGLDDTEEPS PPEDKMFFLDPSDLDVERDEEAVKTI 
SPSTSNNI PU^WQSIKHTKRKSST>ryi<EGWMVHYTSRDNLRK 
RH YWRLDS KCLTL FQNESGS KYYKE I PLSEILR I S S PRDFTNI S 
QGSNPHCFE I ITDTM V YF VGENNGDSSHNPVLAATGVGLDVAQS 
WKKAIRQALMPVTPQASVCTS PGQGKDHKDLSTS IS VSNCQI QE 
NVDIS TVYQIFADEVLGSGQ FG X VYGGKHRKTGRDVA1KVIDKM 
RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFWMEK 
LHGDMLEMILSSEKSRIjPERITKFMVTQILVAIiRNLHFKNIVHC 
DLKPENVLLASAEPFPQVKLCDFGFAR I IGEKSFRRS WGTPAY 
lAPEVLRSKGYNRSLDMMSVGVIlYVSLSGTFPFNEDEDINDQI 
QNAAFM YPPNPWRE I SGEA I DL INNLLQ VKMRKR YS VDXSLSH P 
WLQDYQTWLDLREFETRIGERYITHESDDARWEIHAYTHNLVYP 
KHFIMAPNPDDMEBDP 


5463


237 


1012 


LLSVTMTTSRCSHLPEVLPDCTSSAAPVVKTVEDCGSLVNGQPQ 
YVMQVSAKDGQLLSTWRTLATQS PFNDRPMCRICHEGSSQEDL








1*S P CECTGTLGT I HR S CLEHWLS SSNTS Y CELCHFR FAVER KPR 
PLVEWLRNPGPQHEKRTLFGDMVCFLFITPIATISGWLCLRGAV 
DHLHFSSRLEAVGL I ALTVALFT I YLFWTLVS FRYHCRL YNEWR 
RTNQRVILI,I PKSVNVPSNQPS LLGLHS VKRNSKETW 


5464 




677 


S PSMNPRKKVDLKJb I IVGAIGVGKTSLLHQYVHKTFYEEYQTTIj 
GASILSKI I ILGDTTLKLQIWDTGGQERVRSMVSTFYKGSDGCI 
IAFDVTDLESFEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465 


5278 


3348 


KGDPREF I RVHREALECDYVSAHLHEW I DLI FGYKQQGPAAVBA 
VNVFHHLFYEGQVDIYNINDPLKETAT1GFINNFGQIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSliTPV 
KEX>KEPVGQI VCTDKGI LAVEQNKVI/I P PT WNKTFAWG YADLS C 
RLGT YES DKAMTVY ECIjS EWGQI LCAI C PN P KLV I TGGTSTWC 
VWEMGTSKE KAKT VTIiKQALLGHTDTVT CATASIA Y H I IVSGSR. 
DRTCI I WDlfNKLS FXiTQI»RGHRAP VS AI.CINELTGD I VS CAGT Y 
IHVWS INGNPI VSVNTFTGRSQQI I CC CMS EMNEWDTQMVI VTG 
HSDG WRFVJRMEFLQVPETP APEPAEVLEMQED CPEAQ I GQEAQ 
DEDS SDSEADEQS I SQDPKDTPS QPSSTSHRPRAAS CRATAAKC 
TDSGSDDSRRW3 DQLSLDEKDGFI FVNYSEGQTRAHLQGPLSHP 
HPNPI EVRNYS RLKPGYRWERQL.VFRSKI/TMHTAFDRKDNAHFA 
EVTALGISKDHSRIIiVGDSRGRVFSWSVSDQPGRSAADHWVKDE 
GGDSCSGCSVRFSliTERRHHCRNCGQLFCQKCSRFQSEIKRLKI 
SS PVRVCQNCY YNLQHERGSEDGPRNC 


5466 


3 


392 


HACAHASAHASGRI»VRWWRKRRS VMG IQTS PVLLASLGVGIiVTI» 
LGLAVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGkPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVrKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGIi 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLIiFANQTEKDI 1 1.REDLEELQARYPNRFKLW 
FTLDHPPKDWAYSKGFVTADMIREHL?APGDDVLVI*LCGPPPMV 
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPDPQAR I F I QKKDLEEDES VTAAIILKSRG 
RSPRKI DQFCNS SNMVMGSVTFRDVAI DFSQEEWECLQPDQRTL 
YRDVMLEN YSHIi I SLAGSS I SKPDVT TLLEQBKEP WMWRKETS 
RRYPDLELKYGPEKVSPEKDTSEVNIjPKQVI KQ I STTLGI EAFY 
FRNDSEYRQFEGliQGYQEGNINQKN I S YEKLPTHTPHASLICNT 
HKPYECKECGKYFSCGSNLIQHQSIHTGEKPYKCKECGKAFQLH 
IQLTRHQKFHTGEKTFECKECGKAFNLPTQLNRHKNIHTVKKLF 
ECKECGKS FNRS SNLTQHQS IHAGVKP YQCKECGKAFNRGSNLI 
QHQK1HSNEKPF VCKECGMAFRYH YQL I EHCQ IHTGEKPFECKE 
CGKAFTLiLT KL VRHQ KI HTG E KP FE CRSCG KAF S LLNQLNRH KN 
IHTGEKPFECKECGKSFNRSSNIjVQHQS IHAGI KPYECKECGKG 
FNRG AHL IQHQKI HSNEKPF VCRECEMAFRYHCQL I EHSR I HTG 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKPF 
ECKECGKAFRLHMHLIRHQKi/HTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGEKPFECKECGKVFSLPTOLNRHKNIHTGEKAS 


5468 


225 


2976 


SFIjTDLFQSIiAQLENLCKQLYETTDTTTRliQAEKALVEFTNS PD 
CLSKCQLL LERGS S S YSOLLAATCLTKLVSRTNN PLP LEQRI D I 
RNYVLNYLATRPKIATFVTQALIQLYAR I TKLGWFDCQKDDYVF 
RNAITDVTRFLQDSVEYCIIGVTILSQLTNEINQVSATAFLIEA 
DTTH PLTKHRK I AS S FRDS SLFDI FTLS CNLLKQASGKNLNLND 
ESQHGI*LMQLLKI*THNCLNFDFIGTS TDESSDDLCTVQ I PTS V7R 
SAFLDSSTLQLSTIGRCEYEKTCALLVQLFDQSAQSYQELIiQSA 
SASPMDlAVQEGRIiTWriVYIIGAVIGGRVSFASTDEQDAMDGEL 
VCR VLQLMNLTDS RI»AQAGNEKLEI»AMLS FFEQFRKI YIGDQVQ 
KSSKLYRRLSEVIA3LNDETMVLSVFIGKI ITNL.KYWGRCEP ITS 
KTLQLLNDLS IGYSS VRKLVKI*SAVQFMLNNHTSEHFS FLGINN 
QSNLTDMRCRTTFYTAI^RLLMVDLGEDEDQYEQFMLPLTAAFE 
AVAQMFSTNS FNEQE AKRTLVGLVRDLRG I AFAFKAKTS FMMLF 
EWIYPSYMPII^RAIEIjWYHDPACTTPVLKLMAELVHNRSQRLQ 






WO 01/53312 



PCT/US00/34263











FDVSSPNGILLFRETSKMITMYGNRILTLGEVPKDQVYALFCLKG

ISICFSMLKAALSGSYVNFXSVFRLYGDDALDNALQTFIKLLLSI 

PHSDLLD Y P KLSQS Y YS LU2VLTQDHMNF I AS LEPH V I M Y I hS S 

IS EGLTALDTM VCTGCCS CLDHI VTYLFKQIiSRSTKKRTTPtiNQ 

ESDRFLHIMQQHPEMIQQMLSTVLNIIIFEDCRNQWSMSRPt*LG 

LIbLNEKYFSDLRNSIVNSQPPEKQQAMHLCFENLMEGIERNLL 

TKNRDRFTQNLSAFRREVNDSMKKSTYGVNSNDMMS 


5469 


134 


2653 


DQEFETSLVPWHbPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
EPTCVSDYMSISTCEWKMNGPTNCSTELRLjbYQLVFLLSEAHTC 
VPENNGGAGCVCHLIiMDDVVSADNYTLDLWAGQQLLWKGSFKPS 
EHVKPRAPGNLTVHTNVSDTLLLTWSNPYPPDNYLYNHIiTYAVN 
IWSENDPADFRIYNVTYLEPSLRIAASTLKSGISYRARVRAWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CY VS I TK I KKEWWDQ I PNPARS RLVA III QDAQGSQWE KRS RGQ 
E PAKC PHWKNCIjTKLL PCFI»EHNWKRDEDPHKAAKEMP FQGSGK 
SAWCPVE I SKTVLWPES I S WRCVELPEAPVECE3EEEVEEEKG 
S FCASPESSRDDFQEGREG I VARLTESLFLDLLGEENGG FCQQD 
MGESCLLPPSGSTSAHMPWDEFPSAGPKEAPPWGKEQPLHLEPS 
PPASPTQSPDNLTCTETPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLIaARHLEEVEPEMPCVPQLSEPTTVPQPEPETWEQILRR^IV 
LQHGAAAAP VSA PTSG YQE FVHA VEQGGTQAS AVVGLG P PGEAG 
YKAFS SLIAS SAVSPEKCG FGAS SGEEGYKPFQDLI PGCPGDPA 
PVPVPLFTFGI»DREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPr,PQEQATDPIiVDSLGSGlVYSALTCHLCGHLKQCHGQEDG3 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 
SLAPSGISEKSKSSSSFHPAPGNAOSSSQTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


TACRIRTSjONRGIAAVKKDAVEMLAS yGLAYSLMKFFTGPMSDF
KNVGI.VFVNS KRDRTKAVLCMWAGAI AAVFHTLI AYSDLG Y YI 
INKLHHVDES VGS KTRRAFLYLAAFPFMDAMAWTHAG I LLKHKY 
SFLVGCASISDVIAQWFVAILLHSHLECREPLLIPILSLYMGA 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPIA 
LILATQRI SRP I VNLiFVSRDLGGSS AATEAVAI LTATYPVGHM P 
YGWLTE IRAVYPAFDKNNPSNKLVSTSNTVTAAHI KKFTFVQ4A 
LSLTIiCFVMFWTPWSEKILIDI IGVDFAFAELCWPLR I FSFF 
PVPVTVRAHLTGWIJ^TLKKTFVLAPSSVLRIIVLIASLWLPYL 
G VHGATLGVGS LLAGFVGE STMDAI AAC YVYRKQKKKMENES AT 
EGEDSAMTDMPPTEEVTDIVEMREENE 


5471 


1868 


658 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV

GPGVPGE VEMVKGQ P FDVG PR YTQLQ YI GEGAYGMVS S AYDHVR 

KTRVAI KKI S PFEHQTYCQRTIiRE IQ I LLRFRHENVI G IRD I LR 

ASTJbEANRDVYlVQDLMETDLYKLLKSQQLSNDHICYFIiYQILR 

GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGIARIADPEHD 

HTGFLTEWATRWYRAPEIMI^SKGYTKSIDIWSVGCI1J\EM^ 

NR PI FPG KH YLDQLNH I LG I LGSPSQBDLNCT INMKARNYLQSIi 

PSKTKVAWAKLFPKSDSKALDIiLDRMLTI-'NPNKRITVEEAIiAHP 

YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFQ 

PGVLEAP 


5472 


1469 


753 


L YVMARYLSDEEVA VS IDRLCKANGRS PS I PFGTVRI PGRARVR 
DPQALWIFGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSDKMPGR VVTLLEDHEGCTWGVAYQVQG EQVS KALKYLNVRE A 
VLGGYDTKEVTFYPQDAPDQPLKAIiAYVATPQNPGYLGPAPEEA 
I ATQ I I*ACRGFSGHNLE YLIiRVRDVMQLCGPQAQDEHIiAAI VDA 
VGTMLPCFCPTEQALAI*V 


5473


3 


2113 


FMNVKIiLI QDLEDIEQRVPVMDAQYKI I TKTAHLI TKESPQEEG 
KEMFATMS KLKEQLTKVKECYS PI.L YESQQL.L I PhBEhEKQMTS 
FYDSLGKINE I ITVLEREAQSSALFKQKHQELIACQENCKKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRIiQRQIADIHVAFQSMVKK 
TGDWKKHVETNSRLMKKFEESRAELEKVLRIAQEGLEEKGDPEE 
LIiRRHTEFFSQLDQRVLNAFLKACDELTDILPEQEQQGLQEAVR 
KLHKQWKDLQGEAPYHLLHLKIDVEKNRFLA5AEECRTELDRET








KLMPQEGS E KI I KEHRVFFSDKG PHHLCEKRLQLIEELC VKLP V 
RDPVRDTPGTCHVTLKELRAAIDSTYRKLMEDPDKWKDYTSRFS 
EFSSWI STNETQLKG I KGEAIDTANHGEVKRAVEEIRNGVTKRG 
ETLSWLKSRLKVLTEVSSENEAQKQGDELAKLSSSFKALVTLLS 
EVE KMLSNFGD CVQYKE I VKNSLEEL I SGS KEVQEQAEKILDTE 
NLFEAQQLLLHHQQKTKR I SAKKRDVQQQ I AQAQQGEGGLPDRG 
HEELRKLESTLDGLERSRERQERRIQVTLRKWERFETNKETWR 
YLFQTGSSHERFLSFSSLESLSSELEQTKEFSKRTESIAVQAEN 
LVKEASEI PLGPQNKQLLQQQAKS I KEQVKKLEDTLEEE YVI DK 
S 


5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKS3WLLRQSTI 
LKRWKKNWFDLWS DGHL 1 YYDDQTRQN I EDKVHM PMDC INI RTG 
QECRDTQPPDGKS KDCMLQ I VCRDGKTI SLCAESTDDCLAWKFT 
LQDS RTNTAY VG S AVMTDET SWS S P P P YTAYAA? APE VGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI IRERYRDNDSDLALGMIAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSS PCLLHSPSTFI HTMPPNLTGYYRF 
VSQKNMEDYLQAI.NISIAVRKIALLLKPDKEIEHQGNHMTVRTL 
STFRNYWQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRKVR 


5476


192 


1457 


SDSMSLLDCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRASEVLCSTNVSHYELQVEIGRGFDNLTSVHLARHTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYWTVFTVG 
S WLWVI S P FMA YGSASQLLRTYFPEGMS ETLIRNI ^FGAVRGLN 
YLHQNGCIHRSIKASHILISGDGLVTLSGLSHLHSLVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVP FQDMHRTQMLLQKLKGPPYS PLDI S IFPQSESRMKNSQ 
SGV USG I GESVLVS SGTHTVNSDRLHTPSSKT FS PAFFSLVQLC 
LQQDPEKRPSASSLLSHVFFKQMKEESQDSILSLLPPAYNKPSI 
S LPP VLPWTEP ECD FPDEKDS YWBF 


5477 


3 


1044 


RGNSRLRYSHEDELQLPRLPELFETGRQLLDEVEVATEPAGSRI 
VQEKVFXGLDLLEKAAEMLS QLDLFSRNEDLEE IAS TDLKYLL V 
PAFQGALTMKQVNPSKRLDHLQRAREHFINYLTQCHCYHVAEFE 
LPKTMNNS AENHTANS SMAYPS LVAMASQRQAKI QRYKQKKELE 
HRLSAMKS AVESGQADDERVRE YYLLHLQRWID I SLEEIES IDQ 
EIKILRERDSSREASTSNSSRQERPPVKPFILTRNMAQAKVFGA 
GYPSLPTMTVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLtlRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYEASGGSDEQVMVWKSNF
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5479 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVKFCSDGQSFVTASDDKT
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5480 


444 


1952 


LS LTS RMEE AELVKGRLQAITDKR KI QEE I SQKRLKI e EDKLKH 
QHLKKKALREKWLLDG ISSGKEQEEMKKQNQQDQHQ I QVLEQS I 
LRLEKEIQDLEKAELQISTKEEAILKKLKS IERTTEDIIRSVKV 
EREERAEES IEDI YANI PDLPKS YI PSRLRKE INEEKEDDEQNR 
KALYAMEI KVEKDLKTGESTVLSS I PLPSDDFKGTGIKVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAP VEVEELLRQASERNSKS PTEYH 
EPVYANPFYRPTTPQRET VTPGPNFQER I KI KTNGLGI GVNES I 
HNMGNGLSEER GNN FNHI S PI P PV PHPRS VI QQAEE KLHTPQKR








LMTPWEESNVMQDKDAPSPKPRLSPRETIFGKSEHQNSSPTCQE 
DEEDVRYNIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPLPRKRSEASPHEKHKS 


5481 


3 


1422 


NSPGSVCLCQCVCPSl.LHCIiPPLLbDLl.IiPLLbHESPQPPALRV 
VATSSDP^FMNKHQKPVDTGQRFKTRKRDEKEKFEPTVFRDTLV 
QG LNEAGDDLE AVAKFLDSTGS RLD YRR YADTLFD I LVAGSM LA 
PGGTRIDDGDKTKMTWHCVFSAWEDHETIRNYAQVFNKLIRRYK 
YLEKAFBDEM KKL LL PLKAFS ETEQTKIiAMIiS GIIjIXSNGTLPAT 
1 LTSLFTDSLiVKEG IAAS FAVKLPKAWMAEKDANS VTSSLRKAN 
LDXRLLELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 
KELQKELQERLSQECP I KEWLYVKEEMKRNDLPETAVIGLLWT 
CI MNAVEVJNKKEELVAEQALKHLKQYAPLLAVFSSQGQSELI LI> 
QKVQEYCYDNIHFMKAFQKIWLFYKADVLSEEAI LKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 


1492 


528 


THVVMTGMC Y APHQ VL>S Y I NG VTTS KPG VSL>V Y SM PS RN LS LRIi
EGLQEKDS G P YSCS VNVQDKQGKSRGHS I KTLE LNVLVP PAP PS 
CRLQGVPHVGANVTLS CQS PRSKPAVQYQWDRQLF S FQTFFAPA 
LDVIRGS LShTNLS S S MAG VYVC KAHNEVG TAQ CNVTLE VSTG P 
GAAWAGA WG TLVGLGLLiAGL VLli Y HRRG KALEEPAND I KEDA 
I A PRTLP W P KS SE)T I S KNGTLSS VTS ARALR PPHG PPR PGALTP 
TPSLS SQAI*PS PRLPTTDG AH PQ PIS PI PGGVSS SG LSRMGAVP 
VMVP AQSQAG S I»V 


5483 


1 


788 


FFFFKGCRAGRGNESDYRKI4EEMHQRFLVSERSKDDLQLRL1TRA 
ENRIKQLETDSSEEISRYQEMIQKLQNVLESERENCGLVSEQRIi 
KLQQENKQLRK2TESLRKIALEAQKKAKVKI STMEHEFS I KERG 
FBVQLREMEDSNRNSIVELRHLLATQQKAANRWKEETKKLTESA 
EI RI NNLKS ELS RQKLHTQELLSQLEMANEKVAENEKL I LEHQB 
KANRLQRRI>SQAEERAASASCX3LSVITVQRRKAASLMNLENI 


5484 


3 



1997 


IMADMBDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 
ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 
LQNS DDDEKMQNTDDEERPQLSDDERQQLS EEEKANSDDERPVA 
SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 
EEEODHKSESARGSDSEDEVLRMKRKNAIASDSEADSDTEVPKD 
NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
PIPETRIEVEIPKVNTDLGNDIjYFVKLPNFIiSVEPRPFDPQYYE 
DEFEDEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI KESNAR 
IVKWSDGSMSLHIXaNEVFDVYKAPLQGDHNHLFlRQGTGLQGQA 
VFKTKLTFRPHS TDSATHRKMTLSIiADRCSKTQ KI R ILPMAGRD 
PECQRTEMIKKEBERLRASIRRESQQRRMREKQHQRGLSASYLE 
PDRYDEEEEGEESISIiAAIKNRYKGGIREERARIYSSDSDEGSE 

EDKAQRLLKAKKLTSDEVRPNLFNSRGLSCTQEPTALNEELTDQ 
AGTN 


5485 


161 


1074 


KRK ILSSMMDSEAHEKRPP J.LTSSKQDI S PHITNVGEMKH YLCG 
CCAAFNNVAITFPIQKVLFRQQLYGIKTRDAILQLRRDGFRWLY 
RGI LPPLMQKTTTLALMFGliYEDLS CLLHKHVSAPE FATSGVAA 
VLAGTTEAI FTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI 
GEYYRGLVPILFRNGLSNVLFFGLRGPIKEHLPTATTHSAHLVN 

DFICGGLLGAMLG FLFFP 7 MWKTR TD^n T fiRP n»ric? i?ninn?n vt 
WLERDRKLINLFRGAHLNYHRSLISWGIINATYEFLLKVI 


5486 


1404 


142 



IPGSTISWSPAAARGbSVCRCCRIiHPASAMDLFGDIiPEPERSPR 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSLATS I SQMVKTEGKGAKRKTSEEEKNGS EELVEKKVCKAS S V 
I FGLKGYVAERKGEREEMQDAHVILNDITEECRP PSSLITRVS Y 
FAVFDGHGGI RASKFAAQNIiHQNLIRKFP KGDVI S VEKTVKRCL 
LDTFKHTDEEFL KQAS S QKPAW KDGS TATCVLA VDN I L YI ANLG 
DSRAI LCRYNEBSQKHAALSLSKEHNPTQYEERMR I QKAGGNVR 
DGRVLGVLEVSRS IGDGQYKRCGVTS V PD I RRCQ JjT PNDRF I Iili








ACDGLFKVFTPEEAVNFI DSCLE DEKIQTREGKSAADARYEAAC 
NRLANKAVQRGSADNVTVMWRIGH 


5487 


535 


182 


AVSLEQ I RGLQTPAP VPLPLQPCPSNCDMERVTLALDLIiAGLTA 
LEANDPFANKDDPFY YDWKNLQLS GL I CGGI»LAI AG IAAVLSG K 
CKCKSSQKQHSPVPEKAIPLITPGSATTC 


5488


1072 


259 


AMAASGEPQRQWQEEVAAVWVGSCMTDLVSbTSRLPKTGETIH 
GHKFFI GFGGKGANQC VQA I U^GAMTSMVCKVGKDS FGPID YIEN 
LKQKDI STE FT YQTKDAATGTAS I IVNNEGQNI IVIVAGANIiLL 
NTEDLRAAANVI SRAKVMVCQLEITPATSLEALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAE I LTGLTVGSAADAGE 
AALVLLKRG CQWI ITLGAEGCWLSQTEPE PKH IPTEKVKAVD 
TTVSFKI 


5489 


81 


893 


GKG P VAAF I DQSN I FI/TDP X I FLGQWR EE P KM PLLLLGE TEPLK 
LERDCRS PVEP WAAAS PDLALACLCHCQDLS SGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS IAIRKKQQEWGFLEAJIKI DFKELD 
IAGDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDS F 
FSAKEENIIYSFLGLAPPPDSKGSEKA3EGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETSEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSN1 FLTDPKI FLGQWREEPKWPLLLLGETEPLiK 
LERDCRS PVEPV7AAAS PDIAIACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKOQEWGFLEANKIDFKELD 
I AGDEDNRRWMRENVPGE KKPQNG I PLP PQI FNEEQYCGDFDS F 
PS AKEEN I T YSFLGLkAPP P DS KGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GS APRLS LGPTG AQARDFDW WARPPSR P YTQS KEDR PDTEGRS E 
QGDMAS S FLP AG A 1 TGDS G GE LS SGDDS G EVEFPHS PE IEETS C 
LAELFEKAAAHLQGDIQVASREQLLYLYARYKQVKVGNCNTPKP 
SFFDFEGKQKWEAWKALGDSSPSQAMQEYIAWKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNI FDYCRENN IDH 
ITKAI KS KNVDVNVKDEEGRALL.HWACDRGHKEDVTVLLQHRAD 
INOQDNEGQTAI»HYASACEFLDIVELI»LQSGADPTLRDQDGCIjP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


ASKNPLSAVCTTGIMSSbAVRDPAMDRSLRSVFVGNIPYEATEE 
QLKDIFSEVGSWSFRLVYDRETGKPKGYGFCEYQDQETALSAM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAPIIDSPYGDP 
IDPEDAPES I TRAVASIiPPEQMFELMKQMKLCVQNSHQEARNML 
LQNPQIiAYAliDQAQ WMRIMDPE I ALK I LHRKIHVT P L I PGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNQQNPPAPQPOHIJ^RPVKDI 
PPLMQTP IQGGI PAPGP I PAAVPGAGPGSIiTPGGAMQPQUGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTDLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHEMRG 
G PLGDPRLI* IGEPRGPMI DQRGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGIQGPGP INIGAGGPPQGPRQVPGI SGVGNPGAGMQGTG I Q 
GTGMQGAG I QGGGMQGAG IQGVS IQGGG IQGGGIQGASKQGGSQ 
PSSFSPGQSQVTPHQDQEKAADIMQVLQLTAJ^I^MLPPEQRQSI 
LILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPEEPRKPGRLTQALNSPLTWEHVWICVPGGTPDCL 
TDTFRVKRPHLRRSASNGHVPGTPVYREKEDMYDEI IELKKSLH 
VQKSDVDIMRTXI^IiEEENSRKDRQIEQIjLDPSRGTDFVRTLA 
EKRPDASWVINGLKQRILXDEQQCKEKDGTISKLQTDMKTTNLE 
EMRIAMETYYEEVHRLQTLIiASSETTGKKPLGEKKTGAKRQKKM 
GSALLSLSRS VQELTEENQS LKEDLDRVLSTS PTI SKTQGYVEW 
SKPRUjRRIVELEKKLSVMESSKSHAAEPVRSHPPACIASSSAL 
HRQPRGDRNKDHERI»RGAVRDLKEERTALQEQLI*QRDLEVKQLL 
QAKADLEKELEO^REGEEERREREEVLREEIQTLTSKLQEIjQEM 
KKEEKEDCPEVPHKAQEIiPAPTPSSRHCEQDWPPDSSEEGLPRP 
RS P CS DGRRDAAARVLQAQWKVYKH KKKKAVLDEAA WLQAA PR








GHLTRTKLIaAS kahgs bpps vpglpdos sp VPKVPS P I AQATGS
PVQEEAI VI I QS ALRAHLARARH SATGKRTTTAAS TRRR S ASAT 

HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71 


536 


RSKAKIGTPTREVPSTDMKVRRESSSSLTHRPAPSPATPRLIjGT 
RRVLLGVSEGTGCADAMELVLVFLCSLLAPMVLASAAEKEKEKD

PFHYDYQTLRIGGLVFAWLFSVGILLILSRRCKCSFNQKPRAP
GDEEAQVENIIITANATEPQKAEN


5495


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG
ELRPASLWLPRSLAPAFERFCQVNTGPLPLLGQSEPEKWMLPP
QGAISETRMGHPQFWKYEFGACTGSLASI/EQYSEQLKDMVAFFL

GCSFSLEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP 
LWTMRPIPKDKIiEGLVRACCSLGGEQGQPVHMGDPELLGI KEL 
S KPAYGDAMVCPPGEVP VFWPS PI/TSLGAVSSCETPLAFAS IPG 
CTVMTDLKDAKAP PGCLTPER1 PEVHHISQDPLHYS IAS VSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
GFPTH FNHEPPEETDGP PGAVAL VAFLQAIiEKE VAI I VDQRAWN 
LHQK I VEDAVEQG VLKTQ I P IIiTYQGG S VEAAQAFLCKNGDPQT 
PRFDHLVAIERAGRAADGN YYNARKMNI KHLVDP I DDI»FLAAKK 
IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEAPFA 
VI AG VSN WGG YALACAL Y I L YS CAVHSQY LRKAVG PS RAPGDQA 

WTQALPS VI KBEKMLGI LVQHKVRSGVSGI VGMEVDGLP FHNTH 
AEM I QKI»VDVTTAQV 


5496 


3 


2408 


QDTKMHEI YKGNI TPQLNKNTLKTSAATDVWAVY FSQFW I DY3G 
MKSGKGRPI SPVDS FPLS I WICQPTRYAESQKBPQTCNQVSLNT 
SQSESSDLAGRLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
PSPSSSEADIHLLVHVHKHVSMQINHYQYLLLIiFLHESLILLSE 
NLRKDVEAVTGS PASQTS I CIGILLRSAELALLLH P VDQANTLK 
SPVSESVSPWPDYLPTENGDFIiSSKRXQISRDINRIRSVTVNH 
MSDKRSMSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYR3 
DSNILSFDSDGNQNILSSTLTSKGKETIESIFKAEDLLPEAASL 
SENLD I S KEET P P VRTLKSQS S LSG KPKERCP PNIiAPLCVS YKN 
MKRSSSQMSDDTISLDSMILEEQDLESDGSDSHMFLEKGNKKWS 
TTNYRGTAES VNAGANLQNYGETS PDA I STNSEGAQENHDDLMS 
VWFKI TG VNGE I D I RGEDTE I CLQVMQVT PDQLGN I SLRH YLC 
NRPVGSDQKAV I HS KSS PE I SLR FESG PGAVIHS LLAEKNG FLQ 
CH I KN FS TE FLTS S LMN I QHFLEDE TVATVM PMK I QVS NTK I NL 
KDDS P RS S TVSLEPAPVTVHI DHLWERS DDGS FHIRDS HMLNT 
GNDLKENVKS DSVLLTSGKYDLKKQRS VTQATQTS PGVP WPSQS 
ANFPEFSFDFTREQLMEEiraSLKQEIiAiCAKMALAEAHLEKDALL 
HHIKKMTVE 


5497 


1821 


3308 


S I SKLIiKRRSNIDAYLLSNS CAFFAPRJjFS LASQ I IREQQS PNV
CFrYKYSGFPSLECQCHFVSPHSSCYINFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSgiPSWKDKAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
IiALSRGLQLDTQRSSRDStjQCSSGYSTQTTTPCCSEDTIPSOVS 
DYDYFSVSGDQEADQQEFDKSSTI PRNSDISQSYRRMFQAKRPA 
STAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPIPIK 
TPVI PVKTPTVPDLPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSM PS SMWSGOASVMP PLPG PKP Q T PJ? ptj ona t d n c t? n crviu r» 
EPPSATVSPGQ I PES D PAOLS PRDTPQG EDMLNAI RRG VKLKKT 
TTNDRS APR FS 


5498


2434 


1492 


ILTHQEIFTGEKPCECGKASIQMSHLSQQKIYSGENPFACKVCG*^ 

KVFS HFCSNLTEHEH FHTRE KP FE CNE CGKAFS QKQ YV I KHQNTH 

TGEKLFECKECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 

QKSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 

PFKCSECGTAFGQKKYLIKHQNIHTGBKPYECNECGKAFSQRTS 

LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 

NECGKAFSQFSTLALHIiRI HTGKKP YQCS ECGKAFSQKSHH I RH








QKIHTH 


5499 


324 


926 


GFGQXGRGHKITTYPFSPRKSGRKGMAQSQGWVKRYIKAFCKGF 
FVAVPVAVT FLDRVACVARVEGASMQPSLNPGGS QSSDVVLLNH 
WKVRNFEVHRGDI VSLVS P KNPEQKI I KRVIALEGDI VRTIGHK 
NR YVKVPRGH I WVEGDHHGH5FDSNS FGPVS LGLLHAHATH I LW 
PPERWQKLESVLPPERLPVQREEE 


5500 


1978 


1286 


KPDWRLQNLP PRL YLWRS S RFG FGHLKKRLQMDFKI EHTWDG FP 
VKHEPVFIRLNPGDRG VMMD ISAPFFRDPPAPLGEPGKP FNELW 
DYEWEAFFLNDITEQYLEVEIiCPHGQHIiVLIiLSGRRNVWKQEL 
PLSFRVSRGETKWEGKAYLPWS YFP PNVTKFNS FAIHGSKDXRS 
Y EAL YP VPQHBLQQGQKPDFHCLE YFKS FNFNTLLGEE W KQPSS 
DLWLIEKCDI 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFSLP 
AAlMIJULiISRLI*DWFRSIiFWKEEMEIjTLVGIiQYSGKTTFVNVIA 
SGOPSEDMT PTVGFNMRKVT ^CC?NVT I if TWDTflROPRFR^MMPPY 

O vjP^ k OJui/l 11 (.XV \J JT V L. Aul» V J. X IV1 riUl-VjO^c Ivt XxOl iirV Hf 1% X 

CRGVNAI VYMIDAADREKI EASRNELHNLLDKPQLQGI PVLVLG 
NKRDLPNALDEKQLI EKMNLS A I QDRE I CCYS I S CKEKDNID I T 
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFPVWVPERTAT.r.TPPr.CJAaPGCI^REAPRTaGPPMC'TaMQTrT. 

J.i 0-T-YI7 r V a V r LZ»JTv J. /V1JJJ X * — r i JulO-rti VJOO tviL^VA O -L >V-J7 tr L X#Vrli9 nii 

GKFFKGGGSSKSRAAPSPQEALVRLRETEEMLGKKQEYLENR I Q 
REIALAKKHGTQNKRAALQALKRKKRFE KQLTQIDGTLSTIEFQ 
REAL, ENS HTNTE VLRNMG FAAXAMKS VHENMDLNK I DDLMQE I T 
EQQDIAQEI SEAFSQRVG FGDDFDEDELMAELEELEQEELNKKM 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 


216


654 


KGVRRRGRVRSDSEDSHLGY FKMSFIiLP KLTSKKEVDQAIKSTA 
EKVLVIiRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 
AVYTQYFDISYIPSWFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVKS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLIjQEEWVLIjSQQQKELCGSNKLVAPL 
GPTVANPEI.FRKFGRGPEPWLGSVOGQRSliIiEHHPGKKQMGYMG 
EMEVGGPTRESGQSLPPQKKAYIjSHLSTGSGHIEGDWAGRNRKL 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSrRDKRSRL 
I EGYTGP FKVETLKYKAKS KAHMFC VNALAARDP I WAARFRS IR 
DPPGDVIiAS PEPLFTADCP I FYPPGPLGGFDSMAELI.PSSRABI1 
BDPGGDGAI PAM YLDCT SDLRQKEI TDG XHSSSD IWIIi YNOAVE 
SCIQDPSAEGIiSEBVPWFEELPWFEDVAVYFTREEWGMLDKR 
QKELYRDVMRMNY ELLAS LG PAAAKPDL I S KLERRAAP WI KDPN 
GPKWGKGRPPGNKKMVAVREADTQASAADSAIiIiPGS PVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 
FCSACIERPNLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 
NTVEI KEDT PHTALV P E I SS DLMANMEH F FNAAYS IAYHSRPLN 
DFEKIU2LLQSTGTVILGKYRNRTAGTQFIKYISETLKREILED 
VRNS PCVS VLLDSSTDAS EQACVGI YIRYFKQMEVKES Y1TLAP 
LYSETADGYFETIVSALDELDI PFRKPGWWGLGTDGSAMLS CR 
GGLVEKFQEVIPQLLPVHCVAHRLHLAVVDACGSIDLVKKCDRH 
I RTVFKFYQSS2*KRLNELQEG AAPLEQBI I RLKDLNAVRW VASR 
RRTLHALLVS W P ALARHLQRVAE AGGQ I GHRAKGKLKLMRGFH F 
VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 
LRHQ AG P KE EE FNAS FKDGRLHG I CLDKLE VAE QR FQADRERTV 
LTGI E YLQQRFDADRP PQLKNMEVFDTMAW PSG IELAS FGNDD I 
LNLARYFECSL PTX3YS EEALLEEWLGLKTI AQHLPFSMLCKNAL 
AQHCRFPLLS KLMA VWCVP I S TSCCERGFKAMNR IRTDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
CAQ VPARSPASARLRKEEMGAL YVEE PRTQKPP I LP SREAAEVI* 
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCS PRS IjSAAKMSNRNIINKLPSNLPQLQNLI KRDPPAY I EEFLQ 
QYNH YKSNVE I FKLQPNKPS KELAEL VMFMAQI SHCYPE YLSNF 
PQEVKDLLSCNHTVLDPDLRMTFCKALILLRNKNLINPSSLLEL 
F FELFRCHDKLLRKTL YTH I VTD I KN INAKH KNNKVNVVLQN FTC








YTMLRDSNATAAKMS LDVMIBLY RRNIWNDAKTVNV I TTACFS K 
VTKI LVAALTFFLGKDEDSKQDSDS ESEDDGPTARDLLVQYATG 
KKSS KNKKKLEKAMKVliKKHRKKKKPBVFNFSAIKLIHDPQD FA 
EKLLKQLECCKERFEVKMMLMNLI SRLVG IHELFLFNFYPFLQR 
FLQPHQRBVTKI LL FAAQ ASHH LVPPEI IQSLLMTVANWFVTDK 
NSGEVMTVG INAIKEITARCPLAMTEELLLQDI^QYKTHKDKNVM 
MSARTLIHLFRTLNPQP4LQKKFRGKPTEAS I EARVQEYGE LDAK 
DYIPGAEVLEVEKEENAENDEDGWESTSLSEEEDADGEWIDVQH 
SSDEEQQE ISKKLNSMPMEERKAXAAAISTSRVLTQEDFQKI RM 
AQMRKELDAAPG KSQKRKYIEI DSDEEPRGELLS LRD I ERLHKK 
PKSDKETRIATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLAIiRDALLKKKKRMK 


5506 


1 


1531 


FKGDkCGQRGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CS GGGGEETPGTT PVWS PLEGGGDEE L»R PNP YVRFP YRW WAVW 
LAAFPS J^SAGGETPEAPPES WTQLWFFRFWNAAGYAS FMVPGY 
LLVQYFRRKNYLETGRGLCPPLVKACVFGNEPKASDEVPIiAPRT 
EAAETT PM WQ AL KLL FCATGLQ VS YLTWG VLQER VMTR S YG AT A 
TSPGERFTDSQFLVLMNRVLALIVAGI^SCVIiCKQPRHGAPMYRY 
SFASLSNVLSSWCQYEALKFVS FPTQVLAKAS KVI PVMLMGKLV 
SRRS YEHWEYLTATliI S IGVSMFLLSSGPEPRSSPATTLSGLIL 
LAGYIAFDSFTSNWQDALFAYKMSSVOiyMFGVNFFSCLFTVGSI, 
LEQGALLEGTRFMGRHSEFAAHALLLS ICS ACGQLFI FYTIGQF 
GAAVFTI IMTLRQAFAI LLSCI^YGHTVTWGGLGVAWFAALL 
LRV YARGRLiKQRG KKA.VP VES P VQKV 


5507 


3704 


1271 


PRGTRRCRPAGRAS RRARRRPPCPGPAAPGSIjE I GGFGTAAG KK
VAVADVQFGPMRFHQDQ^/LLVFTKEDNQCNGFCRACEKAGFK 

ctvtkeaqaviacfldkhhdii 1 1 dhrnprqldaealcrs irss 
kj.senwivgwrrvdreelsvmpfisagftrryvenpnimacy 
nei^qlefgevrsqlklracnsvftalensedaisitsei>rfiq 
yanpaff/ttmgyqsgeligkei^bvpinekkadlldtinsciri 

GKEWQGIYYAKKKNGDNIQQNVKIIPVIGQGGKIRHYVSIIRVC 
NGNNKAEKISB CVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT 
EVSSQRRHSSMARIHSMTI EAPITKVINI INAAQESSPMPVTEA 
IiDRVLE I LRTTEIi YS PQFGAKDDDPHANDIiVGG LMS DGLR RLS G 
NEYVLSTKNTQMVSSNI I TP I SIJDDVPPRIARAMENEEYWDFDI 
FEI^AATH^PLIYLGLKMFARFGICEFLHCSESTLRSWLOI IE 
ANYHSSNP YHN STHSADVLHATA YFLS KER I KETVLDPIDE VAAI* 
IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAP 
QLTTGDDKCNI FKNMERND YRTLRQGI I DMVDATEMTKHFEHVW 
KFVNS INKPIATLEF^GETDKNQEVINTMIjRTPENRTIilKRWLI 
KCADVSNPCRPLQYCIEWAAR ISEEYFSQTDEE KQQGIiP WMP V 
FDRNTCS I PKSQ I S FTDYFI TDMFDAWDAFVDIiPDLMQHLDIJNF 
KYWKGIiDEMKLRNItRPPPH 


5508 


1151 


691 


LS S VFS RRSASM FAVGCSMG P FLH YW YLSLDRLF PASGLiRGFPN 
VLKKVLVIX?LVASPLLGVWYFU?LGCIiEGOTVGESCQEriREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYING1jTIX5WDTYI>SYL 
KYRSPVPLTPPGCVAIiDTRAD 


5509 


1238 


619 


R KS RG CQNAIjS AS G PAAAAAA I MVRKLKFHEQKLLKQ VDFLN W E 
VTDHNIjHELRVLRRYRIiQRRED YTRYNQLS RAVRELARRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAF VEQGHVRVG PD WTDPAFL.VTRSM 
EDFVTWVDS SKI KRHVUSYNEERDDFDLEA 


5510 


96 


1195 


PAGAHI^S SGS SEPI*VEPGRGR VG AR VKGERGJjQASGSAPGRS KM
AEGERQP PPDSSEEAPPATQN FI I PKKEIHTVPDMGKWKRSQAY 
ADYIGFILTLNEGVKGKKLTFEYRVSEAIEKLVALIiNTLDRWID 
ETPPVDQPSRFGNKAYRTOTAKLDEE^Ur^VATWPTHLAAAVP 
EVAVYLKESVGNSTR I DYGTGHEAAFAAFLCCIiCKIGVLRVDDQ 
lAIVTKVFNRYLEVMRKIiQKTYRMEPAGSQGVWGIiDDFQFLP FI 
WGSSQIilDHPYLEPRHFVDEKAVNENHKDYMFLECIIiFITEMKT 
GPFAEMSNQLWNISAVPSWSKVNQGIiIRMYKAECLEKFPVIQHF 
KFGSULPIHPVTSG 


5511 


276 


1980 


KLSRVLNLPPENLITSISAVPISQKEEVADFQLSVDSLLEKDND 
HSRPDIQVQAKRIAEKLRCDTWSEISTGQRTVNEKINRELI,TK 
T VLQQV I EZX3S K YGLKS EL FSGLPQKKI WE PSS PNVAKKFH VG 
HLRSTIIGNFIANLKEAIiGHQVIRINYIXSDWGMQFGLLGTGFQIj 
FGYEEKIiQSNPr^HLFEVYVQVNKEAADDKSVAKAAQEFFQRIiE 
LGDVQALSLWQKFRDLS I EEY I RVYKRLGVY FDE YSGESFYREK 
SQEVLKLIiESKGLLIiKTIKGTAWDLSGWGDPSSICrVMRSDGr 
SLYATRDLAAAIDRMDKYNFDTMIYVTDKGQKKHFQQVFQMLK1 
MGYDWAERCQHVP FGVVQGMKTRRGDVTFLED VLNE I QLRMLQN 
MAS IKTTKELKNPQETAERVGLAALI I QD FKGLLLS D YKFSWDR. 
VFQSRGDTGVFLQYTHARLHSLEETFGCGYLNDFNTACLQEPQS 
VS I LQHLIiR FDEVLYKSSQDFQPRHI VS YLLTLSHIiAAVAHKTL 
QI KDS PPEVAGARLHLFKAVRSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSLIiLTITVTGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPI. 
I EKEKI SHNTRR FRFGLPSPDHVLGLPVGN YVQLLAKI DNELW 
RAYTP VS SDDDRGFVDL 1 1 KIYFKNVHPQYPEGGKMTQYliENMK 
I G ET I FFRG PRGR LF Yl IGPGNLG I RPDQTSE PKKTLADHLGM I A 
GGTG ITPMLQLI RKITKDPSDRTRMSblFANQTEEDI LVRKELE 
EIARTHPDQFDLWYTLDR PP I GWKYSSGF VTADM I KEHIiPPPAK 
ST L I LVCGP P P LTQTAAH PNLE KLGYTQDM I FTY 


5513 


2 


637 


ARWRLPSDSPRIPPAGAETPGRGSCRNYLPSSSPPPPEPSSFPS 
P P TSRGG PGSRDTMS DS EEESQDRQLKI VVLGDG ASG KTS LTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNIiNVTliQIWDIGGQTlG 
G KMLD KY I YG AQG VLL V YD ITKYQS FENIiEDW YT WKKVS EES E 
TQPLVAL VGNKI DLEHMRT I KPEKHLRFCQENG FS SHFVSAKTG
DS VFLCFQKVAAE ILG I KIjNKAEI EQSQRWKAD I VNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VKRPS W I MGN FRGHALPGTFFF1 IGLWWCTKS I LKYICKKQKRT 
CYLGSKTliFYR^EILEGITIVGMALTGMAGEQF I PGCPHLMLYD 
Y KQGHWNQLLGWHHFTMY FFFGLLGVADI LCFT I S SLP VS IiTKL 
MLSNALFVEAFI FYNHTHGREMLDIFVHQLLVLWFL.TGLVAFL, 
EFLVRNJ3 VL LEIiliRS S Ij I IjIjQGS WFFQ IGFVLYP PSGGPAWDLM 
DHENI LFLTI CFCWH YAVTI VIVGMNYAFITWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 


5515


1572 


260 


FVRLVGRGDCDPLLSVCLTTMPLYEGLGSGGEKTAWIDLGEAF 
TKCGFAG ETGPRCI I PS V I KRAGMPKP VR VVQYN INTEELYS YI* 
KEF I HILYFRHLliVNPRDRRWT IES VL.CPSHFRE TI/TRVLFKY 
FEVPSVLIJ^SHLMAlJ^TLGINSTWVItDCGYRESLVIiPIYEGIP 
VLNCWGALiPLGG KALHKEIiETQLLEQCTVDTSVAKEQSljPSVMG 
SVPEGVLEDIKARTCFVSDLKRGIiKIQAAKFNIDGNNERPSPPP 
NVDYPLDGEKlLHILGSIRDSVVEILFEQDNEKQSVATLIIJDSI* 
IQCP IDTRKQLAENIiWIGGTSMLPGFLHRLLAE IRYLVEKPKY 
KKALGTKTFRI HTPPAKANCVAWLGGAI FGALQD ILGSRS VSKE 
YYNQTGRI PDMCS LNNPPLEMMFDVGKTQPPLMKRAFSTEK 


5516 


3 


735 


NSREPPQAGPGPS PRKS PTASSFLFPWRPXiASSFWMGAQGAQES 
1 KAMWRVPGTTRRP VTGES PGMHRPEAMLLLLTLALLGGPTWAG 
KMYGPGGGKYFSTTED YDHE I TGLRVS VGLL u VKS VQVKLGDS W 
DVKLGALGGNTQEVTLQPGE Y I TKVFVAFQAFLRGMVMYTS KDR 
YFYFGKLDGQISSAYPSQEGQVLVGIYGQYQLLGIKSIGFEWNY 
PLEEPTTEPPVNLT YS ANS P VGR 


5517 


246 




SEIYVAMRTDSSKMTDVESGVANFASSARAGRRNALPDIQSSAA 
TDGTS DLPIiKLEALS VKEDAKE KDEKTTQDQL BKPQWEEK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLGLLLPLVAALDFNYHRQEGMEA
FLKT VAQN YS S VTHLHS IGKS VKGRNLWVIjVVGR FPKEHRIG I P 
EFKYVANMHGDEWGRELLLHLIDYLVTSDGKDPEITNLINSTR 
IHIMPSMNPDGFEAVKKPDCYYSIGRENYNQYDLNRNPPDAFEY 
NNVSROPETVAVMKWLKTETFVXSANLKGGALVASYPFD^VQA 
TGALYSRSLTPDDDVFQYLAHTYASRNPNMKKGDECKNKMNFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITIiEI»SCCKYPREEKLP 
S FWNNNKASLI EYI KQVHLGVKGQVFDQNGNPI*PNVI VE VQDRK 








HI CPYRTNKYGE YYLLLLPGS Y 1 1 NVTVPGHDPHITKVI I PEKS 
QN FSALKKDI LLP FQGQLDS 1 PVSNPSCPMIPLYRNLPDHSAAT 
KPSLFLFLVSLLHIFFK 


5519 


a / 


477 


I KS KLNQQVE VQESEWRLTEAKG PTMGKE SGWDSGRAAVAAWG 
GVVAVGTVLVALSAMGFTSVG IAASS IAAKMMSTAAI ANGGGVA 
AGSLVAI LQS VGAAGLS VTSKVI GGFAGTALGAWLGS P PSS 


5520 


117 


943 


PTEGRQKVLKTFTVPRSALAMTKTSTCIYHFLVLSWYTFLNYYI 
SQEGKDEVKPKILANGARWKYMTLLNLLLQTIFYGVTCLDDVLK 
RTKGGKDI KFLTAFRDLLFTTLAFPVSTFVFLAFW I LFLYNRDL 
I YPKVLDTVI PVWLNHAMHTFI FPI TLABWLRPHS YPS KKTGL 
TLLAAAS IAY I SRI LWLYFETGTWVY P VFAKLSLLGLAAF FSLS 

YVFIASIYLLGEKLNHWKWVSVQILQRWRLESVGICFQWPDWKS 
PAKHQLVKNIR 


5521 


546


911 


KILNMQKSCEENEGKPQNMPKAEEDRPLEDVPQEAEGNPQPSEE 
GVSQEAEGNPRGG PNQPGQGFKEDTPVRHliD PEEMIRG VDELER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5522 


1224 


637 


GS RPLGQRSREKMW VFGYGSLI WKVDFP YQDKLVGYITN YS RRF 
WQGSTDHRGV PGKPGR WTLVED PAGCVWGVAYRLPVGKEEEVK 
AYLDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYLG 
PAPLEDIAEQI FNAAGPSGRNTEYLFELANS IRNLVPEEADEHL 
FALEKLVKERLEGKQNLNCI 


5523 


3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFQILRAIGKGSFG
KVCI VQKRDTE KM YAM KYMNKQQCI ERDE VRNV FRELE I LQ EI E 
HVFLVNLWYSFQDEEDMFMWDLLLGGDLRYHLQQNVQFSEDTV 
■RliYI CEMAIiALDYLRGQHI IHRDVKPDNILLDERGHAHLTDFNI 
ATI I KDGERATALSGTKPYMAPEIFHSFVKGGTGYSFETOWWSV 
GVMAYELLRGWRPYDIHSSNAVESLVQLFSTVSVQYVPTWSKEM 
VAL LR KLLT VNPEHR LS SLQDVQAAPALAG VL WDHLSE KR VE PG 
FVPNKGRLHCDPTFELEEMILESRPIjHKKKKRIAKNXSRDNSRD 
SSQSENDYLQDCLDAIQQDFVIFNREKLKRSQDLPREPLPAPES 
RDAAE P VEDEAERS ALPMCG P I CPSAG SG 


5524 


85 


2318 


RERERDHR PGES S QGQS GAGG CF PS PTMELRCGGLLFS S RFDSG
NLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CAETE FENGNRSWFYFSVRGGMPGKLI KINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWERI RDRPTFEMTETQFVLS FVHRFVEGRGA 
TTF FAFCYPFS YS DCQE LLNQLDQR FPENHPTHS S PLDT I YYHR 
ELLCYSLDGLRVDLLTITSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKR I FFLSS RVHPGETP S S FVFNGFLDFI LR PDDPRAQTLR 
RLFVFKLIPMLNPDGWRGHYRTDSRGVNLNRQYLKPDAVLHPA 
IYGAKAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDLEKAN 
NLQKEAQCGHSADRHNAEAWKQTEPAEQKLNSVWIMPQQSAGLE 
ESAPDTI PPKESGVAYYVDLHGHASKRGCFMYGNS FSDESTQVE 
NMLYPKLISLNSAHFDFQGCWFSEKNMYARDRRDGQSKEGSGRV 
AI YKASGI IHSYTLECNYNTGRSVNS I PAACHDNGRASPPPPPA 
FPSR YTVEL FEQVGRAMAI AALDMAE CN PWPRIVLS EHS S LTNL 
RAWMLKHVRNSRGLSSTLNVGVNKKRGLRTPPKSHNGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGL 
PGLGS STQKVTHR VLG PVRGKP VWEP LQHVFGCLGHC WGK 


5525


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEEFTX5RVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV 

S STS EE PDENSS S VTSCQASLWMGRVKQLTDEEECC I CMDGRAD 
L I LPCAHS FCQKCI DKWSDRHRNCPI CRLQMTGANE5 WWSDA P 
TEDDMAN Y I LNMADEAGQPHRP 


5526 


3 


853 


RRPCN PVRAAKRTGAAARA PRGLE VTMLR VAWRTLS L I RTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDPPPSTLLKDYQNVPGI EKVDDWKRLLSLEMANKKEMLKI 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEBHLEKHRKDK 
AH KRYLLMS IDQRKKMLKNLRNTNYDVFEK I CWGLG I E YTFP PL 
YYRRAHRRFVTKKALC I RVFQETQKLKKRRRALKAAAAAQKQAK. 








RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


LLRKYLLHQNPIJL.I.RHQPNRTCISFSATMKLKDTKSRPKQSSCG 
KFQTKGIKWGKWKEVKIDPNMPADGCMDDLVCFEELTDYQLVS 
PAKNPS SLFSKEAPKRKAQAVSEEEEEEEGKSSSPKKKI KLKKS 
KNVATEG TS TQ KE PE VKD PELEAQGDDM VCDDPEAGEMTS ENLV 
QTAPKKKKNKGKKGIiEPSQSTAAKVPKKAKTWIPEVHDQKAJDVS 
AWKDLFVPRPVLRALS PLGFS APTP IQALTLAPAIRDKLDILGA 
AETGSGKTLAFAXPMIHAVLQWQKRNAAPP PSNTEAP PGETRTE 
AGAETRS PG KAEAE S DAL PDDT V I E S EALP S D I AAE ARAKTGGT 
VSDOALLFGDDDAGEGPSSLIREKPVPKQNEWEEENLDKEQTGN 
LKQELDDKSATCKAYPKRPLLGLVLTPTRELAVQVKQHIDAVAR 
FTGI KTAILVGGMSTQKQQRMLNRRPE I WATPGRLWELI KEKH 
YHLRNLRQLRCLWDEADRMVEKGHFAELSQLLEMLNDSQYNPK 
ROTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKPKVTDLTRNEATVETLTETKIHCETDEKDFYLYYFLMQYPG 
RSLVFANSISCIKRLSGLLKVLDIMPLTLHACMHQKQRLRNLEQ 
FARLEDC VLLATDVAARGLDI PKVQHVIHYQVPRTS EI YVHRSG 
RTARATNEGLSLMLIGPEDVINFKKX YKTLKKJDEDI PLFPVQTK 
YMDWKERIRLARQIEKSEYRNFQACLHNSWI EQAAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQSGKPPLLVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 
QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLATI S I TLRRYLRLGATMAKSKFE 
YVRDFEADDTCLAHCWVVVRLDGRWFHRFAEKHNFAKPNDSRAL 
QLMCKCAQTVMEELED IV I AYGQSDE YS FVF KRKTNW FKRRAS K 
FMTHVASQFASS YVFYVJRDYFEDQPLLYPPG FDGRVWYPSNQT 
LKD YLS WRQADCHINNLYNT VFWAL IQQSGLTP VQAQGRLQGTL 
AADKN E I LFS EFNIN YNN E PPM YRKG TVLI WQKVDEVMTKE I KL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529


48 


640 


TFRLVSAHLKTRKLINPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWS ESCVE VQEINEEE YLVI I R FTP T VPHCS LATL 
IGLCLRVKLQRCLPFKHKLEI YISEGTHSTEED INKQ INDKERV 
AAAMENPNLREIVEQCVLEPD 


5530 


4541 


2606 


AQI VHAI S YCHKLHVGHRDLKPENWFFEKQGLVKLTDFGFSNK 
FQPGKKLTTS CGS LAYSAPE I LLGDE YDAPAVD I WSLGVI LFML 
VCGQPP FQEANDS ETLTMIMDCK YT V PSHVS KECKDLI TRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
SI I QRMVLGD I ADRD A I VEALETNR YNH I TAT Y FLLAER I LR EK 
QEKE IQTRSAS PSNI KAQFRQS WPTKI D VPQDLEDDLTATPLSH 
ATVPQS PARAADS VLNGHRSKGLCDS AKKDDLPELAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNR LTS RKS AP VLNQ I FEEGE SDDE FDMDENL P P KLS RL KMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKUSGF 
TYS WH RRDSSEG P PGS EGDGGGQS KPSNASGGVDKAS PSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESLK 
LMSLCLGSQLHGSTKYI I DPQNGLS FSS VKVQEKSTWKMCI SST 
GNAGQVPAVGGIKFFSD^IADTTTELERIKSKNLKNNVLQLPLC 
EKT I S VNIQRN PKEGLLCAS S PASCCHVT 


5531 


24 


515 


GSOPRAPRPRDSMERPEPEL IR0S WRAVSRS PLEHGTVLFARLF 
ALEPDLLPLFQYNCRQFSS PF.nCT.SS PEFLDHIRKVMLVIDAAV 
TNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGl«X5E 


5532 


3395 


1402 


SDWM WGKRKM 1 1 EDETEFCGEELLH SVLQCKS VFDVLDGBEMR 
RARTRANPYEKI RGVFFIiNRAAMKMANMDFVFDRMFTNPRDSYG 
KPLVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFBPYYGEGGIDGDGDITRPENISAFRN 
FVLDNTDRKGVHFLMADGGFSVEGQENLQEILSKQLLLCQFLMA 
LSIVRTGGHFICKTFDLFTPFSVGLVYLLYCCFERVCLFKPITS 
RPANSERYWCKGLKVGIDDVRDYLFAVN I KLNQLRNTDSDVNL 








WPLEVIKGnHEFTDyMIRSJJESHCSLQIKALAKlHAFVQDTTlJ 
SEPRQAEI RKECLRLWG I PDQARVAPSSSDPKS KPFELI QGTE I 
DIFSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRWI KLDLKTELPRDTLLS VE I VHELKGEGKAQRKI 
SAI HILDVLVLNGTDVREQHFNQRI QLAEKFVKAVSKPS RPDMN 
PIRVKEVYR LEEMEKI FVRLEMKI I KGSSGTPKLSYTGRDDRHF 
VPMGL Y I VRT VN E PWTMG FSKS FKKKFF YNKKTKDSTFDLP ADS 
IAP FHI CYYGRLFWEWGDG IRVHDSQKPQDQDKLS KEDVLS FIQ 
MHRA 


5533 


94 


789


MKERRAPQP WARC KLVLVGDVQCG KTAMLQVLAKDCY PETY VP 
TVFEN YTACLETEEQRVELSLWDTSGS P YYDNVRPLCYSDS DAV 
LLCFD I SRP ETVDS AL KKWRTE I LDYCP STR VLL I GCKTDLRTD 
LSTLMELSHQKQAP IS YEQG CAIAKQLGPEI YLEGS AFTS E KS I 
HS I FRTASM LCLN KPS PLPQKSP VRSLSKRLLHLPSRS EL I S PT 
FKKEKAKXCSIM 


5534


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVEKRYLAAGAVTLLS L YLLFG YGAS LLCNL IG FVYPAYAS IK 
AIESPSKDDDTVWLTYWVVYALFGIiAEFFSDLLLSWFPFYYVGK 
CAFLLFCMAPRPWNGALMLYQRVVRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDSEARLCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSERILTEAKQKMRELTVNI KMKEDLIKELIKTGWDAK 
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQELENKDLSDVAM 
KVKLQKE FRKKVDAAOKLRVQVLQKKQQDS KKLAS LS IQNE KRAN 
ELEQSVDHMKYQKIQIK2RKLQEENEKRKQLDAVIKRDQQKI KVI 
LSYI PAKYNMKC 


5536 


942 


282 


AAATAASLS PRGCRLRT PS SDVSPSRA? PPSAAPLPTGRAQMSP 
SGRLCLLTI VGLI LPTRGQTLKDTTS SS SADATI MDIQVPTRAP 
DAVYTELQPTS PTP TWFADETPQPQTQTQQLEGTDGPLVTD PET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGFHEDDPF 
F YDEHTLRKRGLL VAAVLFI TG III LTSGKCRQLSRLCRNHCR 


5537 


3 



2391 


RARVS S PQLR V FRSGRPRRLR VLRINRTS VALRL AGTGR FVAXT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGLWS KPDLITFLEQRKE P WNVKS EETVAI QPDVFSH 
^^I^TEHCTBASFQKVISRRHGSCDLENLHLRKRWKREECEG 
HNGC YDEKTFKYDQFDESS VESLFHQQ I LSSCAKS YN FDQYRKV 
FTHSSLLNQQEE IDI WGKHHI YDKTSVLFRQVSTLNS YRNVFIG 
EKNYHCNNSEKTLNQSSSPKNHQENYFLEKQYKCKEF2EVFLQS 
MHGQEKQEQSYKCNKCVEVCTQSIjKHIQHQTIHIRENSYSYNKY 
DKDLSQSSNLRKQI IHNEEKPYKCEKCGDSLNHSLHLTQHQI 1 P 
TEEKP YKWKEOG KVFNLNCSL YLTKQQQ IDTGENL YKCKACS KS 
FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 
EKPYKC KECGKAFNRS SCLTQHQTTHTGE KLYKCKVCSKSYAR S 
SNLIMHQRVHTGEKPYKCKECGKVFSRSSCLTQHRJCIHTGENLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 
CGKAFSYSSDVIQHRRIHTGQRPYKCEECX5KAFNYRSYLTTHQR 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
F S YRS YLTTHRRSHS GERPYXCEECGKAFNSRS YL IAHQRSHTR 
EKL 


5538 


926 




n^MI^J^AFWViSilPVLMLLLLLGLIDISQAQLSCTGPPAIPGIPG 
I PGTPGPDGQPGTPGI KGEKGLPGLAGDHGE FGE KGDPGI PGN P 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDOTIRFDHVI TIOMNNNYEPRSGKFTCKVPGLYYFTYHA 
SSRGNLCVNLMRGRERAQKVVTFCDYAYNT FQVTTGGMVLKLEQ 
GENVFLQATDKNSL LGMEGANS I FSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG
I VDGPAALAS FPETVPAVPGP YGPHRP PQPLP PGLDS DGLKREK 
DEI YGHPLFPLLAL VFEKCELATCS PRDGAGAGLGTPPGGDVCS 
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 








ELEKVHDiiCDNFCHRYI T CLiKGKMP I DXj VI EDRDGGCREDFEDY 
PAS CP SLPDQNNMW IRDHEDSGS VHLGTPG P S SGGLiAS QSGDNS 
SDQGDGLDTS VAS PSSGGEDEDLDQERRRNKKRG I FPKVATN IM 
RAWLFQHLSHPY PS EEQKKQIAQDTGLTILQVNFNWFINARRRI V 
QPMIDQSWRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCAIiPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPUPPGIiDSDGUCRBKDEI 
YGHPLFP LLALVFE KCEIiATCS PRDGAGAGLiGTPPGGDVCS SDS 
FNEDNTAFAKQVRSERPIiFSSNPELDNLMIQAIQVLRFHLLELE 
XGKMP I DltVIEDRDGGCR EDFED YPAS CPSL PDQNNT W IR0HED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGr FPKVATN I MRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTI LQVNNWF INARRRI VQPM I DQSNRTGQGAAFS PEG 
QPIGG YTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


143 


1440 


PPLGAGAGVHARS PHPARRLPLTTAGVGGRAPDLLPT PWRQHRG 
PSGAAAPGCAL»PRGQAIaEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAAUVSFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRS ERPL FSSN PELDNLMI QAI QVl^RFHLIf ELE 
KG KMP I DLV I BDRDGG CREDFED YPAS CPS LPDQNN I WIRDHED 
SGSVHLGTPGPSSGGIiASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRG I F P KVATN IMRAWL FQH LSHPY PS E EQKKQ 
IiAQDTGIiTILQVNNWFINAKRRIVQPMIDQSNRTGQGAAFSPEG 
QP I GGYTETEPHVAFRAPAS VGDE FGTRKEE WHYL 


5542 


148 


1440 


PPLGAGAGVHARSPHPARRLPIiTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALAS FPETVPAVPGP YGPHRPPQPLP PG1.DSDGLKREKDBI 
YGHPLFPLLAIiVFE KCELiATCS PRDGAGAGLGTPPGGD VCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLM I QAIQVLRFHLLELE 
KGKMPIDLVI EDRDGGCREDFEDYPASCPSLPDQNNI W IRDHED 
SGSVHLGTPGPSSGGIiASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLT I LQVNNWF INARRRI VQPMIDQSNRTGQGAAF3 PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYI* 


5543 


2405 


665 


RWVREQPWP 1>RTSEAVKT PALRP FPGPRGVS PFPKPDWGKSPAP 
KRPFSDSGAFWS PERRPGVIiEAPRRRPVPAS FRAVP PKPTRVHG 
SSASRDRVLARTMIVADSECRAEIiKDYLRFAPGGVGDSGPGEEQ 
KES RARRGPRGPS API P VEEVLREGAESLEQHLGLEALMS S GRV 
DNLAWMGLHPDYFTSFWRLHYLLLHTDGPLASSWRHYIAIMAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGI.HRAPEKLRKLSEINK 
LLAHRPWLITKEHIQALLKTGEHTWSIjAELIQALVIJLiTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLOESIjLRDEGTSQEEMESRFELBKSESLL 
VTPSAD ILEPS PHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHGYSIilQRLYPEGGQLLDEKFOAAYSLTYNTIAMHSGV 
DTSVLRRAIWN YI HCVFGIRYDD YD YGEVNQLLERNLKVYI KTV 
ACYPEKTTRRM YNLFWRHFRHSEKVII VNIJjLLEARMQAALL YAL 
RAITRYMT 


5544 


1895 


514 


lggllgrqrlllrmgagrlgapmerhgrasatsvssageqaagd 
pegrrqeplrrrassasvpavgasaegtrrdrlgsysgptsvsr 
qrveslrkkrplf pwfgld iggtlvklvyfepkditaeeeeeev 

ESLKS IRKYLTSNVAYGSTGI RDVKLELKDLTLCGRKGNliHFIR 
FPTHDMPAFIQMGRDKNPSSLHTVFCATGGGAYKFEQDFIiTIGD 
iiQLCKJLDEI/DCIf I KGIIiYI DS VGFNGRS QCY YFENPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLJLTGCTTFEEALEMAS RGDSTKVDKLVRDI YGGD YER FG 
LPGWAVASS FGNMMS KEKREAVS KEDLARATti I T I TNN IGS I AR 
MCALNENI NQVVFVGNFIJRINTIAMRLLAYALD YWS KGQLKALF 
SEHEGYFGAVGALLELUCIP 


5545 


802 


131 


GAM W S AGRG G AAW P VIiLGLLLALLV PGGGAAKTGAEI* VTCGS VI* 
KLLNTHHRVRLHSIIDIKYGSGSGQQSVTGVEASDDANSYWRIRG 
GSEGGCPRGSPVRCX3QAVRLTHVLTGKNLHTHHFPSPLSNNQEV 
SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSVFLSVT 
GEQYGS PIRGQHEVHGMPSANTHNTWKAMEGI F I KPS VE PSAGH 
DEI* 


5546


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHS FVFT 
RGCTGRNTRQLS LD VR R VME P LTASRLQ VRKKNS I*KD CVAVAG P 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNI^IKRCLLIDYNPDSQELDFRHYSIKVVPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTE IGPRMTLQLI KVQEGVGEGKVMFHS 
FVS KTEEELQAI LEAKE KKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLEJLGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRIiAKS PGRKRKRWEMDRGRGRI* 
CDQ KFPKTKDKSQG AQARRGPRGAS RDGGRG RGRGRPG KR VA 


5547 


1592


146 


FVPRGGHSS MGQSGRSRHQKRARAQAQLRNLE AYAAN PHS FVFT 
RGCTGRNIRQI*SLDVRRVMEPCTASRLQVRKKNSLKDCVAVAGP 
I^VTHFblLSKTETNVYPKLMRLPGGPTLTFQVKKYSLVRDVVS 
SLRPJIRMHEO^FAHPPI*I*Vl*NSFGPHGMHVKI*MATMFQNI*FPSI 
NVHKVT^I*NTIKRCLLIDYTTPDSQELDFRJfYSlKVVPVGASRGMK 
KLLQEKFPNMSRLQD1S ELLATGAGLSESEAEPDGDHNI TELPQ 
AVAGRGNMRAQQS A VRLTE I G PRMTLQI*I KVQEGVGEGKVMFHS 
FVS KTEEEXX?AIliEAKEKKliRI*KAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EY FCQA VGE AP SEDLFP EAKQ KRLAKS PGRKRKRWEMDRGRGRL 
CDQKFP KTKDKS QG AQ AR RG PRGASRDGGRGRGRGRPGKRVA 


5548 


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA 
DSNETTTTSGPPDPGASQPLLAWLLLPLbLLIiLVLLLAAYFFRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEE IRI RSADDCKQFREEFNS LPSGHIQGTFEliANKEEN 
REKNRYPNII*PWDHSRVljLSOI*DGIPCSDYINASYIDGYKEKNK 
FIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY 
WPDQG CWTYGN IR VCVEDCWI*VD YTI RKFCI QPQI*PDGCKAPR 
I»VSQI*HFTS WPDFGVPFTP I GMLKFLKKVKTLNPVHAGP I WHC 
S AGVGRTGTFI VIDAMKAMMHAEQKVDVFEFVSR IRNQR PQMVQ 
TDMQYTF I YQALI*EYYI*YGDTELDVSSLEKHI*QTMHGTTTHFDK 
IGLEEEFRKI*TirVRIMKE*^RTGNI*PA*SMKKAORVlQIIPYDF>n^ 
VII»SMKRGQEYTDYINASFIDGYRQKDYFIATQGPL*AHTVEDB*W 
RM I WEWKSHTI VM LTEVQB REQDKCYQ YWPTEGS VTHGE I T I E I 
KNDTI/SEAIS IRDFLVTLNQPQARQEEQVRWRQFHFHGWPE IG 
I PAEGKGMIDLI AAVQKQQQQTGNHP I TVHCSAG AGRTGTF I AI* 
SNILERVKAEGLLDVFQAVKSI/RLQRPHMVQTIiEQYEFCYKVVQ 
DFIDI FSDYANFK 


5549


915 


256 


FEATGGKR1^FKMAGTARHDREMAIQAKKKLTTATDPIERLRI*Q 
CLARGS AG I KGLGRVFR IMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEEltFQR FDKDGNGTI DFNEFLLTLRPPMSRARKEVI MQAF 
RKLDKTGDGVITIEDLREVYNAKHHPKYQNGEWSEEQVFRKFLD 
NFDSP YDKDGLVT PB EFMN YYAGVSAS IDTDVYFI IMMRTAWKL 


5550




1210 


RKRKVFLKMRRI^RKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDTWMKYE YEVDKDFSSKLRIN I DI 
TVAWKCQY^ADV^T^E'rWASADGLWEPTVFDr^S PQQKEWQ 
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYVNKVAGNFHlTVGKAIPHPRGHAHIiAALVNHESYN 
FSHRIDHLSFGELVPAIINPLDGTEKIAIDIiNQMFQYFITWPT 
KLHTYKISADTHQFSVTERERI INHAAGSHGVSGIFMKYDLSSI* 
MVTVTEEHMPFWQFFVRLCG I VGGI FSTTGMLHGIGKFI VE 1 1 C 
CRFRLGSYKPVNSVPFEDGHTDNHLPI*LENNTH 


5551 


211 


1700 


MQRDHTMDYKESCPS VS I PS SDEHREKKKRFTV YKVLVS VGRSE 








WFVFRRYAEFDKLYNTLKKQFPAMALKI PAKRIFGDNFEPDFI K 
QRRAGI^EFIQNLVRYPELYNHPDVRAFLQMDSPKHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LLAKRKlilX3KFYAVKVLQKKIVIjNRKEQKHIt4AERPrVLLKlWKH 
PFLVGI*HYSFQTTEKliYFVLDFVNGGELFFHl»QRERSFPEHRAR 
FYAAEIASALGYLHS I KIVYRDLKPENILLDSVGHWLTDFGLC 
KEGIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 
EMLiYGIjP P FYCRDVAEMYDN I LHKPLSLR PGVSLTAWS ILEELL 
EKDRQNRLGAKEDFLE IQNHPFFESLSWADLVQKKI PPPFNPNV 
AGPDDIRNFDTAFTEETVPYSVCVSSDYSIVNASVLEADDAFVG 
FSYAPPSEDLFL 


5552


2748 


930


LGPAAGAAMGKKHKKHKAEWRSSYEDYADKPLEKPLKIiVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKE EKKRKREREH CDTEGEADDFDPG KKVE VE PP PDR 

IAPGYSMI I KHPMDFGTMKDKI VANEYKSVTEFKADFKLMCDWA 
MTYNRPDTVYYKLAKKI LHAG FKMMSKQAALLGNEDTAVEEPVP 
EWP VQVETAKKSKKPSRE V I S CM FEPEGNACSLTDSTAEEHVI* 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSliliYSWNTAEP 
DADEEETHPVDLSSLSS KLLPG FTTLGFKDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDETGVQCAIiSIjQEFVKDA 
GS YS KKWDDLLDGI TGGDHS RTL FQL KQRRNVPM KP PDSAKVG 
m*T .on*; <; qvt .ppm Qn/ivcvDnvcvnT omt <; q t *czk\jvwi .r>prrnQ 

XI X JJUL/oooo V XjC»S L'torilO X trU V A V yjLOmidSlJUIV V AJVfi>Ul/IrUUi> 

HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDQHHIi 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 


5553 


74 


1095 


LGREAVYLVS RMDG P VAEHAKQE PFHWT P1»1>ES WALS QVAGM P 
VFLKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAA YAARKLG I PAT 2 VTjPESTS LQ WQRLQGEGAE VQLTGKVWD 
EANLRAQEIoAKRDGWENVPP FDH PLI WKGHASLVQELKAVLRTP 
PGALVLAVGGGGLLAG WAGLLEVGWQH VP I I AMETHGAHCFNA 
A I TAGKLVTL PDI TSVAKS LiGAKTVAARALECMQVCKI HSEWE 
DTEAVSAVO^LLDDER^VEPACGAALAAIYSGLL.RRLQAEGCIi 
PPSLTSWVIVCGGNNINSRELQALKTHLGQV 


5554 


166


2318 


CSGRTGGRGSLRPAE^TV'CijTCKLSGAETRGLlaCPALRTWIMKVlj 
GRSFFWVLFPVbPWAVQAVEHEEVAQRVIKLHRGRGVAAMQSRQ 
WVRDS CRKLSGLLRQKNAVIiWKIjKTA I GAVEKDVGLSDEEKLFQ 
VHTFEI FQKRLNESFJSISVFQAVYGI^RAIjQGDYKDVVNMKESSR 
QRLEALREAAI KEETE YMEIXAAEKHQVEALKNMQHQNQSLSML 
DEII»EDVRKAADR1»EEEIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANS KQNI TKREVEDDLGIiSMLIDSQNWQ YILTKPRDSTI PRADH 
HF I KD I VT IGMIiS L PCG WLCTAIGLPTM FGYI I CGVLLGPSGLN 
SIKSJVQVETLGEFGVFFTLF^VGLEFSPEKLRKVWKISLOGPC 
YMTLLMI AFGLLWGHtiLRIKPTQSVF I STCLSIiSSTPLVSRFLM 
GSARGDKEGD I DYS TVLLGMI*VTQDVQliGLFKAVM PTLIQAGAS 
ASSS I WEVLR ILVLIGQILFSLAAVFLLCLVI KKYL1GPYYRK 
LHMES KGNKE I L I LG I SAF I FLMLTVTELLD V£ MELGCFLAGAL 
VSSqGPWTEEIATSIEPIRDFLAIVFFASIGLHVFPTFVAYEL 
TVLVFIjTLSVVVMKFLLAAL VL5I»I LPRSSQYI KW I VSAGLAQ V 
S3FSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 
TRCVPRPERRSSIi 


5555


212 


1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
GTmPQNLSTFCLLIxL YLIGAV JAGRDFYKILG VPRSAS ZKDXK 
KAYPJCIiAI^LHPDRNPDDPQAQEKFQDIjGAAYE^TIjSDSEKRKQY 
DTYGEEGLKDGHQSSHGDI FSHFFGDFGFMFGGTPRQQDRNI PR 
GSDI IVDLEVTLEEVYAGNFVEVVRNKPVARQAPGKRKCNCRQE 
MRTTQLGPGRFQMTQEVGCDECPNVTGjVWEFJ?T1>EVEIEPGVRD 
GMEYPF IGEGEPHVDGEPGDLRFRI KWKHPI FERRGDDLYTNV 
TISLVBS hVGFEMDITlilJXSHKyHXSRDKITRPGAKLWKKGEG I> 
PKFDNNNI KGSLI I TFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
VYNGLOGY 


5556 


5835 


3346


RTRGMSKNCVPMEFEEYLLRMFOGTFYLLQKITKDNNAHTVKSR 








LEELDESYIEKFTDFLRLFVSVHLRRIBSYSQFPWEFLTLtFK 
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE 
DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR 
QS LE WAKVMELLPTHAFSTLPPVLQDNLE VYLGLQQFI VTSGS 
GHR LNI TAENDCRRLH CS LRDLS SLbQAVGRIiAEYFI GDVFAAR 
FNDALTVVE RL VKVTL YG SQ I KI*YN IETAVP S VLKP DI* I DVHAQ 
SLAALQAYSHWIiAQYCSEVHRQNTQQFVTLI STTMDA I TPLIST 
KV QDKLLLSA CHliLVS LATTVRP VFLI S I PAVQKVFNRI TDASA 
LRLVDKAQVLVCRALSNILLLPWPNLPENEQQWPVRSIl^IHASIil 
SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVI>EDIVENI 
SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMLSFFL 
TLFRGLRVQMGVPFTEQIIQTFIiNMFTREQLAESILHEGSTGCR 
VVEKFDKILQVVVQEPGQVFKPFLPSI IALCMEQVYP 1 1 AERPS 
PDVKAELFELLFRTLHHNWRYFFKSTVXiASVORGIAEEQMENEP 
QFSAIMQAFGQSFLQPDIHLFKQNLFYLETLNTKQKLYHKKIFR 
TAMLFQFVNVLLQVIiVHKSHDLLQEElGIAIYNMASVDFDGFFA 
APLPEFLTS CDGVDANQKSVIX5RNFKMDRVRRERGRAKRRAEWA 
RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL 


5557 


1712 


491 


VI LGAGLRDKDMWI P WGbPRRLRLSAI*AGAGRFCILGSEAATR 
KHI/PARNHCGLSDSSPQLWPEPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQ I YLGKPS RP PHLLLECNPGPG I LTQ ALLEAG AKW 
ALE SDKT F I PHLES LG KNLOGKLR V I HCDFFKLDPRS GG V I KP P 
AMSSRGL FKNLG IE A VPWTADI PLKWGMFPSRGEKRAL WKLAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKLY 
LI QMI PRQNIiFTKNIiTPMN YN I FFHLLXH CFGR R5 AT VI DHLRS 
LTPLDARDI LMQIGKQED2KWNMHPQDFKTLFETIERS KDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGCTH PQ VPADLGA PAE PRR PQKTCt/CLLQPQPGGQRG PTTM I 
TGVFS MRLWTPVG VLTSLAYCLHQRRVALAELQE ADGQ C P VDRS 
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVE D I PFLSPTFNPQ E VF I RSTN I FRNLE S TRCLLA 
GLFQCQKEGP 1 1 IHTDEADSEVLYPNYQSCWS LRQRTRGRRQTA 
ShQPG I SEDLKKVKDRMG I DSSDKVDFFILLDNVAAEQAHNLPS 
CPMLKRFARMIEQRAVDTSLYI LPKEDRESLQMAVGPFLH I LES 
NLLKAMDSATAPDKI RKLYLYAAHDVTFI PLLMTLGI FDHKWPP 
FAVDLTMELYQHI.ESKEWFVQLYYHGKEQVPRGCPDGLCPLDMF 
LNAMSVYTLSPEKYHALCS QTQVMSVGNEE 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 
ELDVVDPDGSVPVGLRQRNQTEKQSTGVYNREAMLNFCEKETKK 
LMQREMSMDESKOVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEPKRGGIiKKSFSRDRDEAGGKSGEKPKEEKI IRG IDKGRVRAA 
VDKKEAGKDGRGEERAVATKKEEEKKGSDRNTGLSRDKDKKREE 
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTCTKKDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPS I FDEPLERVKNNDPEMT3VNVNNSDC 
ITOEILVRFTEALEFNTWKLFALANTRADDHVAFAIAIMLKAN 
KT ITS LN LDSNH I TG KG I I*AI FRALLQNNTLTELR FHNQRHI CG 
GKTEME IAKLLKENI^LLKLGYHFELAGPRNTVTNLiLSRNMDKQ 
RQKRLQEQRQAQEAKGE KKDLLE V P KAG AVAKG S PKPS PQ PS P K 

GDKVLPAQEKNSRDQLLAA I RSSNLKQLKKVEVP KLLQ 


5560 
5561 


9 

2175 


921 
1775 


SSWEFSALSVSMACLSPSQLQKFQQDGFLVLEGFLSAEECVAM 
QQRIGEIVAEMDVPLHCRTEFSTQEEEQIjRAQGSTDYFIjSSGDK 
1 RFFFEKGVFDEKGMFLVPPEKS INKIGHALHAHDPVFKS ITHS 
FKVQTLARSLGLQMPWVQSMYI fkqphfggevs phqdas flyt 
EPLGRVLGVWIAVEDATLENGCLWFI PGSHTSGVSRRMVRAPVG 
SAPGTSFLGSEPARDNSLFVPTPVQRGALVLIHGEWHKSKQNL 

sdrsrqaytfhlmeasgttwspenwlqptaelpfpqlyt 

CSfFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
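
The column legend above can be turned into a small checking aid. The following is a minimal sketch, not part of the original table: the helper names `join_segment`, `unexpected_symbols` and `span_in_codons` are hypothetical, the legend is treated as case-sensitive, and the two predicted nucleotide locations are assumed to be inclusive and may be listed in either order.

```python
import re

# One-letter codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def join_segment(lines):
    """Join the wrapped lines of one table entry and drop all whitespace."""
    return re.sub(r"\s+", "", "".join(lines))


def unexpected_symbols(segment):
    """Return (position, character) pairs for symbols outside the legend."""
    return [(i, ch) for i, ch in enumerate(segment) if ch not in LEGEND]


def span_in_codons(begin, end):
    """Whole codons covered by two inclusive nucleotide locations.

    Assumption: the locations may be given in either order.
    """
    return (abs(end - begin) + 1) // 3


if __name__ == "__main__":
    segment = join_segment(["MS S PS", " PGK"])
    print(segment, unexpected_symbols(segment), span_in_codons(342, 1385))
```

Comparing `span_in_codons` for an entry with the length of its joined segment gives a quick, rough consistency check on a transcribed row; it does not validate the sequence itself.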








QLIiAPTYFSAPGVKNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYGGVTYYNPAQQQVQPKPSPPRRTPQPVT-KPPPPEWSRGS 
S 


5562 


342 


1385 


SSGKWDMAAAG^GIiVRGLKAGVLSQADYLNbVQCETLEDLKLH 
LQSTD YGNFLANEAS PLTVSVIDDRLKEKMWE FRHMRNHAYEP 
LAS FLDFITYSYMIDNVILLITGTLHQRS I AELVPKCHPLGS FE 
QMEAVNIAQTPAEL YNAIIiVDTPLAAFFQDCI S EQDLDEMN I E I 
IRNTL YKAYLE S FY KFCTLLGGTTADAMCP I LE FEADRRAF I IT 
INSFGTELSKEDRAKLFPHCGRLYPEGIiAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVF YAFVKLKEQECRN I VW I AEC I AQRHRAKI DNY IPIF 


5563 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCETLEDLKLH 
LQSTDYGNFLANE ASPLT VS VI DDRLKEKM WE FRHMRNHAYEP 
LAS FLDFI TYS YM IDNVI LLI TGTLHQRS I ABLVP KCHPLGS FE 
QMEAVNIAQTPAEH.YNAI LVDTPLAAFFQDC I S EQDLDEMN I EI 
IRNTL YKAYLES F YKFCTLLGGTTADAMCP I LE FEADRRAF I IT 
INSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVW I AEC I AQRHRAKIDN Y I PI F 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGLALLL 
LLGLG LGLEAAAS PLSTPTSAQAAGPSSGSCP PTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
GVS DC SGGTDKKLRNCSRLACLAGELRCTLSDDC I PLTWRCDGH 
PDCPDSSDELGCGTNEILPEGDATTMGPPVTLESVTSLRNATTM 
GPPVTLES VPS VGNATS SSAGDQSGS PTAYGV I AAAAVLS AS LV 
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNS PNPARAGS I SRPQRAPGS VSAVAMTAAVFFGCAFIAFGPA 
LALYVFTI ATE PLR 1 1 FL I AG AFFWLVSLLI SSLVWFMARVI ID 
NKDGPTQKYLLI FGAFVSVYIQEMFRFAYYKLLKKASEGLKS IN 
PGETAPSMRLLAYVSGLG FG IMSGVFS FVNTLSDSLGPGTVGIH 
GDSPQFFLYS AFMTLVI ILLHVFWGI VFFDGCEKKKWGILL I VL 
LTHLLVSAQTF I S S YYG I NLASAFI I LVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT 
KNVKE Y W WMMYW I VFAL YT VI ETVADQT VAWF P LY YELKI AFV 
IWLLS PYTKGASLI YRKFLHPLLSSKEREIDDY I VQAKERGYET 
MVNFGRQGLNLAATAAVTAAVKSQGAI TERLRSFSMHDLTTI QG 
DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 
EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKX 
RPQVYF 


5567 


1554 


233 


E FLGS GVS PDIjANEDGJLTALHQCCIDDFREM VQQLLEAGANI NA 
CDSECWTPLHAAATCGHLHLVELLIASGANLIiAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITODS IEAARAVPELRMLDDIRSRLQ 
AGADLiHAPLDHGATLLHVAAANGFSEAAALLLEHRASLSAKDQD 
GWEPLHAAAYWGQ VP LVELLVAHGADLNAKS LMDET PLDVCG D E 
EVRAKLIiELKHKHDALLRAQSRQR SLLRR RTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
P PPPEEDNPEWRPHNGRVGGS PVRHLYSKRLDRSVS YQLSPLD 
STT PHTLVHDKAHKTIADLKRQRAAAKLQRP P PEGPES PETAE P 
GLPGDTVTPQPDCGFRAGGDPPLLKLTAPAVEAPVERRPCCLLM 


5568 


1731 


587 


AEDROPASRRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL 
SLLVSGPRLFLLQQPLAPSGLTLKSEALRNWQVYRLVTYIFVYE 
NP ISLLCGAI II WRFAGMFERTVGTVRHCFFTVI FAIFSAI IFL 
S FEAVSS LS KLGEVEDARGFTPVAFAMLGVTT VRSRMRRALVFG 
MVVPSVLVPWLLLGASWLIPQTSFLSNVCGLSIGLAYGLTYCYS 
IDI^ERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSCHPHLSPSHPVSQTQHASGQKLASWPSCTPGHM 
PTLP P YQPASGLCYVQNHFGPNPTSSSVYPASAGTSLG IQPPTP 
VWSPGTVYSGALGTPGAAGSKES SRVPMP 


5569 


2 


835 


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG 








LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGIjPGP 
KGEPGIPAIPGIRGPKGQKGEPGLPGHPGKNGPMGPPGMPGVPG 
PMG I PGEPGEEGRYKQKFQS VFTVTRQTHQPPAPNSJUIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTSKTNQWSGGVLLRI^>VGESVWI*AVWDYYDMVGIOG 
SDSVFSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSS PS PGKRRMDTD WKL I ES KHE VTI LGGLNE FWKF YG PQGT 
P YEGG VWKVRVDLPDK Y P FKS PS IG FMNKI FH PN I DE AS GTVCL 
DVINQTOTALYDLTNXFESFLPQLLAYPNPIDPLNGDAAAMYLH 
RPEBYKQKIKEYIQKYATEEALKEQEEG1X3DSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGG VATSTE E PAR PRAPQS RG PGP VSQTGRGRERGGGDT 
MS S PS PGKRRMDTD WKL I ES KHE VTI LGGLNE FWKFYGPQGT 
PYEGGVWKVRVDLPDKYPFKS PS I G FMNKI FH PN I DE ASGTVCL 
D V INQTWTALYDLTN I FES FLPQLLA Y PNP IDPLNGDAAAMYLH 

RPEEYKQFCIKEYIQKYATEEAIjKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 

DAGAAAEGGGWAAAALALLTGGGEMLLNVAI>VALVLLiGAYRLWV 

RWGRRGLGAGAGAGEESPATSLPRMKKRDFSIiEQLRQYDGSRNP 

RI LLAVNGKVFDVTKGS KF YG PAG P YG I FAGRDAS RGIiATFCl»D 

KDALRDEYDEJLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 

EPSEYTDEEDTKDHNKQD 


5573 


2562 


219 


VPARTPNAEDQGPEARAATATPCQSGGRERAGEAAEDGVKMAAF 
S EMG VMPE I AQAVE EMDWLL PTD 1 QAES I PL I LGGGDVLMAAET 
GSGKTGAFS I PVI QI VYETI*KDQQEGKKGKTTI KTGASVLNKWQ 
MNP YDRGS AFA I G SDGLCCQ S REVKE WHGCRATKGLMKGKH Y YE 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNKQ5X> 
NYGEEFTMHDT IGCYLDI DKGHVKFS iCNGKDLGLAFEIPPHMKN 
QAIiFPACVbKNAELKFNFGEEEFKFPPKDGFVALS KAPDG Y I VK 
SQHSGNAQVTQTKFLPNAPKALIVEPSRELAEQTLNNIKOFKXY 
IDNPKLRELLIIGGVAARDQLSVLENGVDIWGTPGRIjDDLVST 

gkln^qvrflvldeadgllsqgysdfinrmhnqipqvtsdgkr 
lqvivcsatlhsfdvkklsekimhfptwvdlkgedsvpdtvhhv 

WPVNPKTDRLWERLiGKSH IRTDDVHAKDNTR PGANS PEMWSEA 
I KILKGE YAVRAI KEHKMDQAII fcrtkidcdnleqyfiqqggg 
pdkkghqfscvclhgdrkpherkqnlerfkkgdvrflictdvaa 
RGidihgvpyvinvtlpdekqnyvhrigrvgraermglaislva 

TEKE KVW YHVCS S RGKGCYNTRLKEDGGCTI WYNEMQIJLS E I EE 

hi^ctisqvepdikvpvdefdgkvtygqkraagggsykghvdil 
aptvqelaalekeaqtsflhlgylpnqlfrtf 


5574 


1731 


952 


NEGLEVFKEQELQP3DKGAVPEDASTERSAMASLGLQLVGYILG ' 

llgllgtlvamllpswktss yvgas ivtavgfskglwmecaths 

TGXTQCDIYSTLLGhPADIQAAQmMVTSSAXSSLACIXSVVGM 

rctvfcqesrakdrvavaggvffilggllgfipvawnlhgilrd 
fysplvpdsmkfeigealylgi isslfsli agi ilcfscscqrn 
rsnyydayqaqplatrss prpgqppkvksefns ysltgyv 


5575 


456 


766 


llwal pcp pptaaavlls s tglmellekmlaltlakads p rtal 
lcsawlltas fs aqqhkgs lqkdplls qacvgclealldyldar 


5576 


249 


2146 


RSWGAPWFWRMRLLRRRHMP LRLAMVGCAFVLFLFLLHRDVSSR 
EEATEKPWLKSLVSRKCHVLDLMLEAMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
S KWTPLETQEKEEG YKKHCFNAFASDR I S LQRS LGPDTRP PECV 
DQKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLK 
EIILVDDASTEEHIiKEKLEQYVKQLQWRWRQEERKGI.ITARIj 
LGASVAQAE VLT FLDAHCEC FHGWLEPLLARI AEDKTWVS P D I 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWETLPPHEKQRR 
KDETYP Z KS PTFAGGLFSI SKSYFEHIGTYDNQME I WGGENVEM 








SFRVWQCGGQLE1 1 PCSWGHVFRTKSPHTFPKGTSVI ARNQVR 
LAEVWMDSYKKIFYRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HNFSWYLHNVYPEMFVPDLTPTFyGAIKNLGTNQCIiDVGENWRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCIiHVSKGALG 
LGSCHFTGKNSQVPKDEEWELAQDQIiIRNSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCS CGEISVHCXjPWVLF I LDLKVES SMFCPI>K1»II,LP VLLD 
YSLGLNDLWSP PELTVHVGDS ALMGCVFQSTEDKCI FKIDWTli 
S PGEHAKDE Y VJjY YYSNLSVP I GRFQNR VHLMGD I LCNDG SLLL* 
QDVQEAD^ITICEIRLKGESQVFKKAVVLHVLPEEPKELMVHV 
GGLIOMGCVFQSTEVKHVTKVEW I FSGRRAKEEI VFRYYHKLRM 
SVE YSQSWGHFQNRVNLVGDI FRNDGS IMIiQGVRESDGGNYTCS 
IHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVIjGGNQLVIIV 
GIVCATILLL.PVL1IJIVKKTCGNKSSVNSTVLVKNTKKTNPEIK 
EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMKPV 
WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 
VIFGDFSSFRAIjLEPEIjRPEDRIl»VIjGCGNSAIiSYELFljGGFPNV 
TSWYSSVWAAMQARYAHVPQLRWETMDVRKLDFPSASFDVVIi 
EKGTLDAIil»AGERDPWTVSSEGVHTVDQVl»SE VSR VL»V PGGRF I 
SMTSAAPHFRTRHYAQAYYGWSIiRHATYGSGFHFHLYLMHKGGK 
LSVAQLALGAQIbSPPRPPTSPCFJ^QDSDHEDFLSAIQL 


5579 


3 


1540 


" " RNSGLARGAS AIARHGGGIAGGVGWDCGACASRCQGVMEGLLTR 
cral.palatcsrqlsgyvpcrfhhcaprrgrrlijlsrvfqpqnr> 
REDRVLSLQDKSDDIjTCKSQRLMLQVGL i ypaspgc yhllp ytv 
RAMEKLVRVIDQEMQAIGGQKVNMPSLSPAEliWQATNRWDLMGK 
ELLRLRDRHGKS YCI*G PTHEEAI TAIi IAS Q KKI>S Y KQL P FI*L YQ 
VTRKFRDEPRPRFGLLRGREFYMKDMYTFDSSPEAAQQTYSLVC 
DA YCS LFN KLGI* PFVKVQADVGT I GGTVSHE FQLPVDI GEDRLA 
ICPRCSFSANMETLDLSQMNCPACQGPLTKTKGIEVGHTFYLGT 
KYSSIFNAQFTNVCGKJ>TLJ^KGCYGI^VTRILAAAIEVLSTED 
CVRWPSLLAP YQACIjI PPKKGS KEQAASEIi I GQL YDH I TEAVPQ 
UIGEVXiLDDRTHLTIGNRLKDANKFGYPFVIIAGKRAIJSDPAHF 
EWCQNTGBVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRCI PGFWPSGAGYSAPAQRGRRSSGRMRAAAAPGtiTAP 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
G PS RYVLGMQEL FRGHS KTREFLAHSAKVHS VAWSCEGRRI*ASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVDQLCWHPSNPDI>FVT 
ASGDKTIR I WD VRTTKC I ATVNTKGEN INI CWS PDGQTIAVGNK 
D DWT F I DAKTHRS KAE EQ F KFEVNE I SWNNDNNMFFI/TNGNGC 
INIIjS YPELKPVQS INAHPSNCI C I KFDPMGKYFATGS ADALVS 
LWDVDELVCVRCFSRLDWPVRTLSFSHDGKMIiASASEDHFIDIA 
EVETGDKLWEVQCESPTFTVAWHPKRP3ULAFACDDKDGKYDSSR 
EAGTVKIiFGLPNDS 


5581 


54 


947 


GGGSGPRAP S ATItLDTGES VAAVASG EDKG I AASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YN PSLYPTNS PS YAPEFQFLHSAYATIiLMKQAWPONSSS CGT2G 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPNPYQTAMYPIRSAYPQQNLYAQGAYYTQPVYAAQPHVIHH 
TTVVQPNSIPSAIYPAPVAAPRTNGVAt4GMVAGTTMAMSAG*n»L 
TTPQHTAIGAHP VSMPTYRAQGTPAYS YVPPHW 


5582 


5775 


2739 


I ITHNNNVI IPIiVIAYHLSGSAQARGERSPAERLMERQKRKADI 
EKGLQFIQSTLPLKQEEYEAFTjLKLVQNIiFAEGNDLFREKDYXQ 
ALVQYMEGIJWADYAASDQVALPRELLCKLHVNRAACYFTMGLY 
EKALEDSEKALGLDSESIRALFRKARALNELGRHKEAYECSSRC 

si^phdesvtqlgqelaqklglrvrkaykrpqeletfsllskg 
taagvadqgtsnglgsiddietdcyvdprgspallpstptmplf 
phvldllapldssrtlpstdsiiddfsdgdvfgpeldtiiijdslsd 

VQGGLSGSGVPSEIjPQLiI P VFPGGTPLLPPWGGS I PVSSPLPP 
ASFGIiVMDPSKKLAASVIjDAIiDPPGPTLDPIjDLLPYSETRIiDAIj 








DS FGSTRGSLDKPDS FMEETNSQDHRP PSGAQKPAPS PEPCMPN 
TALbl KN P1AATHEFKQACQLCYPKTGPRAGDYT YREGLEHKCK 
RDILLGRLRSSEDQTWKR IRPRPTKTS FVGS Y YLCKDMINKQDC 
KYGDNCTFAYHQEEIDWTEERKGTLNRDLLFDPI1GGVKRGSI.T 
IAKLLKEHQGI FTFLCE I CFDSKPRI IS KGTKDSPS VCSNLAAK 
HSFYNNKCliVHIVRSTSIiKYSKIRQFQEHFQFDVCRHEVRYGCL 
REDSCMFAHSFIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH 
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGCWEPDKDLK 
YCSAKARHCWTKERRVLLVMS KAKRKWVS VR PLPS I RNFPQQYD 
LC1KAQNGRKCQYVGNCS FAHSPEERDMWTFMKENK3 LDMQQTY 
DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWIi 
CGKNSNSKKQWQQHIQSEKHKEKVFT3DSDASGWAFRFPMGEFR 
LCDRLQKGKACPDGDKCRCTtfiGQEELNEWLDRREVLKQIOiAKAR 
KDMLLCPRDDD FG KYNFLLQEDGD IJW3ATP EAPAAAATATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAI KEGGSGS PS FS S PMDI FDMFFGGGGRMARERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVICSKCEGVGGKKGSVEKCPIi 
CKGRGMH1H1QQIGPGMVQQIQTVCIECKGQGER1NPKDRCESC 
SGAKVI REKKI I E VH VE KGMKDGQ KI LFHGEGDQEPELEPGDV I 
IVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI 
LVITS KAGEVI KHGDLRCVRDEGMP I YKAPLEKGI LI IQFLVI F 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
I KKA YRKLALKYH PDKNPDEGEKFKLI SQAYE VLS DP KKRDVYD 
QGGEQAI KEGGSGS PSFS S PMD IFDM F FGGGGRMARERRG KNW 
HQLS VTLEDLYNG VTKKLALQKNVI CEKCEGVGGKKGS VEKCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKI IE VHVEKGMKDGQKI L FHGEGDQEPELEPGDVI 
I VLDQKDHSVFQRRGHDLI MXMKIQLS EALCGFKKTIKTLDNRI 
LVI TS KAGEVI KHGDLRC VRDEGMP I YKAPLE KG I LI IQFLVI F 
PEKHWLSLEKLPQLEALLP PRQKVR I TDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHS LTYATILEMQAMMTFDPQDILLAGNMMKEAQ'M lcqrh rrks 
SVTDSFSSISVKRPTLGQFTEEBIJiAEVCYAKCLLQRAALTFLQD 
ENMVS F I KGG I KVRNS YQT YKELDSLVQS SQYCKGSMH PH FEGG 
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLCWLLLCTHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYVJELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFS SN PI SLP VPALEMM YI WNG YAV I GKQPKLTDGILE 1 1 TK 
AEEKLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYBH YLI PNALLELALLLMEQDRNEEAI KLLESAKQ 
NYKNYSMESRTHFR I QAATLQAKSSLENS SRSMVS SVS L 


5586 


2619 


915 


LPAGTPE S SLHEALDQCMTALDL FLTNQFSEALS YLKPRTKESM 
YHSLT YAT I LEMQAMMTFDPQD I LLAGNMMKEAQMLCQRHRRKS 
SVTDS FS SLVNR PTLGQFTEEE I HAE VC YAKCLLQRAALTFLQD 
ENMVSFIKGGIKVRWSYQTYKELDSLVQSSQYCKGENHPHFEGG 
vjumvimu viLiXU&rujt' XK±JjK-UJjE.c w\jt ovjWKJJ rCaJbLOLiEECiAS 
GHS FRS VLCVMLLLC YHTFLT FVLGTGl^VN I E EAE KLLKP YLNR 
YPKGAI FLFLAGR I E VI KGN I DAAI RR FEE CCEAQ QHW KQFHHM 
CYWELMVICFTYKGQWKMSYFYADLLSKENCWSKATYI YMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RR YFSSNP I SLPVPALEMMYI WNG YAVIGKQPKLTDG I LEI I TK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFR I QAATLQAKSSLENSSRSMVSSVSL 


5587 


1768 


148 


SSAVPDGAVGRPVAVAVGGPPHSCRCRPCCLMAAIGVHLGCTSA 








C VA V YKDGRAG VVAND AGDRVTPA WAYS ENE E X VGLAAKQ SRI 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCI.VIEKNGKLRYE 
IDTGEETKFVKPEDVARL.IFSKMKETAHSVLGSDANDWITVPP 
DFGE KQKNALGEAARAAG FNVLRL I HE ? SAALIAYGIGQDS PTG 
KSNILVFKLGGTSLSXjSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
TJjAQ YLAS EFQRS F KHDVRGNARAMM KI/TNS AE VAKHSLS TLGS 
ANCFLDSLYEGQDFDCNVSRARFELLCSPIiFNKCIEAIRGLLDQ 
NGFTADDINKWLCGGSSR I PKLQQLIKDLFPAVELLNS I P PDE 
VI PIGAAI EAG 1 1* I GKENLLVEDSLM I ECS ARD I L VKG VDE SGA 
SRFTVLFPSGTPI»PARRQHTLQAPGSISSVCLEt>YESDGKNSAK 
EETKFAQVVLQDLDKKENGLRDIIAVLTMKRDGSLHVTCTDQET 
GKCEAISIEIAS 


5588 


3 


589 


XPPP PEQAMVAATVAAAWLIjLWAAACAQQEQDFYDFKAVNI rgk 
LVS LE KYRGS VS LVVNVAS ECG FTDQH YRALQQLQRDIX3PHHFN 
VIAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQI TAiVRKLI LLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAA1.PGEEGDPTRGRSLGRASWESGS 
PRRPRSPFS SFLPRPI CLSLEARPCS I EDRRNWSlilGRPGAPAS 
GbmSSGLMU3PDRCRPRSRCSCRVMENPSPAAALGKAIiCALI,L» 
ATLGAAGQPLGGES I CS ARAPAKYS ITFTGKWSQTAFPKQYPLF 
RPPAQWS S ItLGAAHSSD YS MWRKNQY VSNGLRDFAERGEAWALM 
KE I E AAGE ALQS VHAVFSAP AVP SGTGQTS AELEVQRRHS t*VS F 
WRIVPSPDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSGF 
TFSSPNFATIPQDTVTE ITSSSPSHPANS FYYPRIjKAIjPP I ARV 
TLLRLRQSPRAF1PPAPVLPSRDNEIVDSASVPETPLDCEVSI.W 
SSWGLCX3GHCGRLGTKSRTRYVRVQPANNGSPCPELEEEAECVP 
DNCV 


5590 


72 


896 


LCS SG ALRLLPAM VAWRS AFLVCLAFS LATL»VQRGSGDFDD FNL. 
EDAVKETSSVKQPWDHTTTTTTNRPGTTRAPAKPPGSGLDLADA 
LDDQDDGRRKPGI GGRERWNHVTTTTKRPVTTRAPANTLGKDFD 
LADAIiDDRNDRDDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYG SNDDPG SGMVAEPGT I AGVASALAMALI GAVSSY 1 SY 
QQKKFCFS I QQG1LNADYVKGENLEAWCEEPQVKYSTL.HTQSAE 
PPPPPEPARI 


5591 


68 


1494 


AGSSRRAAABRI_LVSAGCRSI>AGRASGVl,IiLPAEL.LPGEEEAMA 
LRVTRNS KINAENKAKINMAG AlOXVPTAP AATSKPGLRPRTALG 
DIGNKVSEQLQAKMPMKKEAKP S ATGKVI DKKLPKPLEKVPMIiV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPIIjVDTASPSPMETSG 
CAPAEEDIiCQAFSPVIIAVNDVDAEIXSADPNLCSBYVKDIYAYIi 
RQIiEEEQAVRPKYLLGREVTGNMRAILIDWLVQVQMKFRLLQET 
MYMTVS I IDRFMQNNC VP KKMLQLVGVTAMF IASKYEEMYPPE I 
GDFAFVTDNTYTKHQIRQMEMKILRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
I LDNGEWTPTLQH YLS YTEESLI*PVMQHLAKNAAMVNQGI»TKHM 
TVKNKYATS KHAKI STLPQLNSALVQDLAKAVAKV 


5592 


242 


924 


YGES KDWNQKDLLSALVLTTVNCbPTP I MAKSAEVKLAI FGRAG 
VGKSALVWFLTKRFIWEYDPTLESTYRHQATIDDEVVSMEII.0 
TAGQEDT IQREGHMRWGEGFVLVYD I TDRGS FEEVI»PLKNI LDE 
I KKPKKVTL.ILVGNKADLDHSRQVSTEEGEKLATELACAFYECS 
ACTGEGNITE I FYELCREVRRRRMVQGKTRRRS STTHVKQAINK 
MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGI VADLS EQSLKDGEERGEEDPEEEHELPVDMET INLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCI/RQmilKCIENLEELQSLR 
BLDLYDNQI KKI ENUEALTELE I LDI SFNLLRN I EGVDKLTRLK 
KLFL\HSINKISKIENLSNLHQLQMLELGSNRIRAIENIDT1,TNLE 
SLFIX3KNKITKLQNLDALTNLTVLSMQSNRL.TKIEGLQNLVNIXR 
ELYLS HNGI EVIEGLENNNKLTMLDI ASNRI KKIENI SHLTELQ 
E FWMNDNLIiES WS DLDE I»KG ARS LETVYL»ERNPIjQKDPQYRRKV 








M1ALPSVRQI DATFVRF 


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKIX5EERGEEDPEESHELPVDMETINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR 
ELDLYDNQIKKIENLEALTELEILDISFNLLRNIEGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1476 


ARWNGRW VQV PAW PG PG CGTNASGERQRQLPRANRP VGRTLGS E 
PIALAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LG I PT VPGKVTLQ KDAQNL I GI S I GGGAQ YCPCL Y I VQ VFDNTP 
AALDGTVAAGDEI TGVNGRS I KGKTKVE VAKM I QEVKGE VT I H Y 
NKLQAD P KQG MSLD I VLKK VKHRLVENMS SGTADALGLSRAI LC 
NIX3LVKRLEELERTAELYKG>1TEHTKNLLRAFYELSQTHRAFGD 
VFSVIGVREPQPAASEAFVKFADAHRSIEKFG IRLLKT IKPMLT 
DLNTYLNKAI PDTRLTIKKYLDVKFEYLSYCLKVKEMDDEEYSC 
r ALGEPLYRVSTGNYE YRL I LRCRQEARARFSQMRKDVLEKM2L 
LDQKHVQDI VFQLQRLVSTMSKYYNDCYAVLRDADVFP I EVDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596 


£98 


219 


GAVLAPS SLPAAELAAQGE S QSLEDLSNTS RPTSE VYK IS F I FP 
NGDKYDGDCTR'TS SGI YERNGIGIHTTPNG 1 VYTGSWKDDKMNG 
FGRLEH FSGAVYEGQ FKDNM FHGLGTYTFPNG AKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3 


731 


ISCKMAADGQSSLPASWRSVTLTHVEYPAGDLSGHLLAYLSLSP 
VFVIVGFVTLIIFKRELHTISFLGGLALNEGVNWLIKNVIQEPR 
PCGGPHTAVGTKYGMPSSR^QFMWFFSVYSFLFLYLRMHQTNNA 
RFLDLLWRHVLSLGLLAVAFLVSYSRVYLLYHTWSQVLYGGIAG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKIiQ 


5598 


326 


2440 


GIGPIAAS FI FCKVASLYI FLS PPPPS VSG VPYSPANSS WSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLLAVALGFFEG 
DAKFGERN EGSGARRRRCLNGN P PKRLKRRDRRMMS QLELLSGG 
EMLCGGFYPRLSCCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHSQS LFHS PEREVLERDLVLPLLCKD YCKEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLD2HKLVQSGIKGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERV1AIGPHDHILRVVEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYIILGDGM 
ITUDDMEEMDGLSDFTGSVIiRLDVDTDMCNVPYS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTD INI NLT I LC S DS NGKN RS 
SARILQI I KGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGS YVFGDRNGNFLTLQQS P VTKQWQEKPLCLGTSGS CRGY FSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQP AQTLTS ECSRLCRNG YCTPTG KCCCS PGWEGDFCRTG 


5599 


326 


2440 


GIGPIAASFI FCKVASLYI FLS PPPPS VSGVPYSPANS SWSCAlT" 
VPLLGSGVPPHP PAPS PCCSGQTMLKMLS FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNP PKRLKRRDRRMMSQLELLSGG 
EP^CGGFYPRLSCCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCSPHS QSL FHS PERE VLERDLVLPLLCKDYCKEFFYTCR 
GHIP3FLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
S LAFHPN Y KKNGKLYVS YTTNQERWAI GPHDH I LRWE YT VSRK 
NPHQVrLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGM 
ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYSIPRSNPHFNS 
TNQP PEVFAHGLHDPGR CAVDRHPTDININLT I LCSDSNGKNRS 








SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
H I LGFGEDELGEVY I LS SS KSMTQTHNGKLYKI VDPKRPLMPEE 
CRATVQPAQTLTSECSRbCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5601 


1977 


1244 


SLRVLSGHIjMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVWQTAASNKGLRGLLHPQQLHLLSRQLEDPWGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRRALKWIRPQTSE 


5602 


246 


766 


YHTSCTVWRTAKEALENTEVPVGCLMVYNNEVVGKGRNEVWQTK 
NATRHAEMVAIDQVLDWCRQSGKSPSEVFEHTVLYVTVEPCIMC 
AAALRLMKI PLWYGCQN E R FGGCGS VSNI AS ADLPNTGRP FQC 
IPGYRAEEAVEMLKTFYKQENPNAPKSKVRKKECQQILNMF 


5603 


1 


565 


FRGRTPISGGERGCAQYPIPATPARSGENRTMPGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
CFGFEDLHFRWTYNSSDAFKILIEGTVKNEKSDPKVTLKDDDRI 
TLVGSTKEKRNNISIVLRDLEFSDTGKYTCHVKNPKENNLQHHA 
TIFLQWDRRMQ 


5604 



1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT 

GGDFGGGDFGGGD FGGGGS FGGHCLD YCES PTAHCNVLNWEQVQ 
RLDG I LS ET I P I HGRGNF PTLELQPS L I VKWRRRLAE KRI GVR 
DVRLNGSAASHVLHQDSGLGYKDLDLIFCADLRGEGEFQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
S LSNNSG KNVELKF VDS LRRQFEFS VDS FQI KLDS LLLF Y ECS E 
NPMTETFHPTI IGES VYGDFQEAFDHLCNKI I ATRNPEEIRGGG 
LLKYCNLLVRGFR PASDE I KTLQRYMCSRFFI DFSDIGEQQRKL 
ESYIXINHFVGLEDRKYEYLMTLHGVVNESTVCLMGHERRQTLNL 
ITMLAIRVLADQNVI PNVANVTCY YQP AP YVADANFSNY Y I AQV 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRR YPLPLRSGKEAKILQH FGEGLCRMLDERLQRHRTSG 
GDHAPDSPSGENS P APQGRLAEVQDSSMP VPAQ PKAGGS GS YWP 
ARHSGARVI LLVLYREHLNPNGHHFLTKEELLQRCAQKS PRVAP 
GSARPWPALRSLLHRWLVLRTHQPARYSLTPEGLELAQKLAESE 
GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLELRP 
GEYRVLLCVD IGETRGGGHRPELLRELQRLHVTHTVRKLHVGDF 
W7VAQETNPRDPANPGELVLDHI VERKRLDDLCSSI I DGRFREQ 
KFRLKRCGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 
FFVKRTAD I KESAAYLALLTRGLQRL YCGHTLRSRPWGTPGNPE 
SGAMT3 PN PLCSLLTFSDFNAGAI KNKAQS VREVFARQLMQ VRG 
VSGE KAAALVDR YSTPAS LLAAYDACATPKEQETLLST IKCGRL 
QRNLGPALSRTLSQLYCSYGPLT 


5606 


3 


1099 


" GRSRCPGPGARGGTMS PRSCLRS LRLLVFAVFSAAASNWLYLAK 
LSSVGSISEEETCEKLKGLIQRQVQMCKRNLEVMDSVRRGAQLA 
IEECQYQFRNRRWNCSTLDSLPVFGKVVTGGTREAAFVYAI SSA 
GVAFAVTRACS SGELEKCGCDRTVHG VS PQGFQWSG CS DNI AYG 
VAFSQS F VDVRERSKGAS S SRALMNLHNNEAGRKAI LTHMRVEC 
KCHGVSG S CE VKTC WRAVPP FRQVGHALKEKFDGATEVE PRRVG 
SS RALVPRNAQFKPHTDEDLVYLE PS PDFCEQDMRSG VLGTRGR 
TCNKTSKAIDGCELLCCGRGFHTAQVELAERCSCKFHWCCFVKC 
RQCQRLVELHTCR 


5607 


521 


141 


PPVCNPAEAMPS PGTVCSLLbLttMT »WLDLAMAGSSFLSPEHQRV 
QQRKBSK K P PAKLQPRALAGWLRPEDGGQAEGAEDELEVRFNAP 
FDVGI KLSGVQ YQQHSQALGKFLQD I LWEEAKEAPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCX3DAVGYPLDTVKV 
RIQTEPKYTGIWHCVRDTYHRERVWGFYRGLLIiPVCTVSLVSSE 
VFGTYR^CLAHICRIiRFGNPDAKPTKADITIjSGCASGLVRVFIjT 
S PTEVAKVRLQTQTQAQ KQQRRI>SASGPLAVPPMCPVPPACPE P 
KYRGPLHCIiATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 

vlcewlspaghsrpdvpgvlvaggcagviawavatpmdviksri, 
qadgqgqrryrgllhcmvtivreegprvlfkglvlnccrafpvn 

MWFVAYEAVLRLARGLIiT 


5609 


1628 


304 


akgwvlpsppprpgrgalvsgsgi*rrgrsgtswrprrmnhksk 
krireakrsarpelkdsldwtrhnyyes fsls paavadnverad 
alqlsveefveryerpykpwllnaqegwsaqekwtlerlkrky 
rnqkfkcgedndgys vkmkmkyyi e ymestrdds plyi fdss yg 

EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTG A HIDPLGTSAWNALVQGHKRWCLFPTSTPRELI KVTRDEGG 
NQQDE AITWFNV 1 YPRTQLPTW P PEFKPLE I LQKPGETVFVPGG 
WWHVVXiNI^TTIAITQNFASSTNFPVVWHKTVRGRPKLSRJCWYR 
ILKQEHPEIiAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 

R 


5610 


54 


1196 


LERTPA3ADMAWTKYQL FLAGIiML VTGS INTI>S AK WADN FMAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPC^PFNPI,LFLPPALCDMTCTSIJ^YVALNMTSASSFQMLRGA 
VI I FTGLFSVAFLGRRLVLSQWLGI LATI AGLVWGLADLliSKH 
DSQHKLSEVITGDIjLIIMAQIIVAIQMVIjEEKFVYKHNVHPIiRA 
VGTEGLF GFVI LSLiLLVPMYYI PAGS FSGNPRGTLEDALDAFCQ 
VGQQPLI AVALIiGNI SS I AFFNFAGI S VTKELSATTRMVLDSJLR 
TWI WAIiSLALGWEAFKALQI LGFIjIIiLIGTALYNGLHRPLLiGR 
LSRGRPLAEESEQERIiLGGTRTPINDAS 


5611 


2 


577 


fvlpnrlgipgstfrgpgacasssslaasakpgaggspalamsg 
elsnrfqggkafgllkarqerrlaeinrbflcdqkysdeenlpe 

KLTAFKE KYME FDIjNNEGE 1 DLMS LKRI4MEKLGVPKTHLEMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVZ.KLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEENHIFELMQAMWLCKHLNS 
SliTLENLILNEFSYTATEARRLYLQRKTVPSALLVQLIQERLA 
EEDC IKQGWILDGI PETREQALRIQTIiG I TPRHVI VIiSAPDTVIi 
IERNLGKRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISELET 
AQKIiLE YHRN I VRV I PS Y PKI LKVI SADQPCVDVF YQAI/T YVQS 
NHRTNAPFTPRVLbliGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLtPIiSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRNLFFFLCLNLS FAFVELLYGIWSNCI*GI»ISDS FHMFPDST 
AILAGLAAS VI SKWRDNDAFS YG YVRAEVLAGFVNGLFLI PTAF 
FI FSEGVERALAPPDVHHERLLLVS ILGFWNLIG I FVFKHGGH 
GHSHGSGHGHSHS L FNGALDQAHGHVDHCHS HEVKHGAAHSHDH 
AHGHGHFHS HDGPSIiKETTGPS RQ ILQGVFLHI LADTLGS IGV I 
ASAIKMQNFGLMIADP I CS I LIAI LI WS V I PLLRES VG I LMQR 
TPPLLBNS LPQCYQR VQQLQG V YS LQEQHFWTLCSDVYVGTLKL 
IVAPDADARWILSOTHMT FTOfi f:\7PriT .wnmcA tvm 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
APWGARQRLG VMAELQQLQEFE I PTGREALRGNHSAJLLRVAD YC 
EDNYVQATD KR KALE E TMA FTTQALAS VAYQ VGNLAGHTLRM LD 
LQGAALRQ VE ARVS TLGQMVNMHM EKVARRE I GTIATVQRLP PG 
QKVIAPENLPPLTP YCRRPLNFGCLDDI GHG I KDLSTQLSRTGT 
LSRKSIKAPATPASATLGRPPRIPEPVHLPWPDGRLSAASSAS 
SliAS AGSAEGVGGAPTPKGQAAP PAPPLPS SLDP PPP PAAVE VF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWVPASYIjEKVVTLYPYTSQKDNEIjSFSEGTVICVTRRY 








SDGWCEGVSS EGTGFFPGNYVEPSC 


5615 


9 


1558 


AbGRRRPGDPREMEAAATPAAAGAARR E ELDMDVMRPLI NEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGI S FVQTLMHIiLKOJIGTG 
LLGLPLA IKNAGI VLG PI SLVFIGI IS VHCMHILVRCSHFLCbR 
FKKSTLGYSDTVSFAMEVSPWSCLQKQAAWGRSWDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LRI YMLCFLP FI I LLVFIRELKNLFVLS FliANVSMAVS LVI I YQ 
YWRNK PDPHNIiP I VAGWKKYPLFFGTAVFAFEGIG WLPLENQ 
MKESKRFPQALNIGMG IVTTLYVTLATLGYMCFHDE I KGSITLN 
LPQDVWLYQS VKILYSFGI FVTYS IQFYVPAE 1 1 1 PGI TSKFHT 
KWKQ ICEFG I RSFLVS ITCAGAI LI PRLDI VI SFVGAVSSSTLA 
L I L? PLVEI LTFSKEHYNI WMVLKNI SI AFTGWGFLLGTY ITV 
EEI I YPTPKWAGTPQSPFLNLNSTCLTSGIiK 


5616 


1 


719 


DD F VRCGPQ S AAMGAS ARLLRAV I MG APGSGKGTVS S R ITTHFE 
LKHLSSGDIjIjRDNMLRGT E IGVLAKAF I DQGKL I PDDVMTRLAL 
HELKNLTOYSWbLDGFPRTLPQAEALDRAYQIDTVINLir/PFEV 
I KQRLTARW I HPASGRVYNI EFNP P KTVG IDDLTGEP L I QREDD 
KPETVI KRLKAYEDQTXPVLEYYQKKGVLETFSGTETNKI WPY V 
YAFLQTKVPQR3QKASVTP 


5617 


176 


765 


P WRGRGSR PRGAGAMAEEQVNRSAGIiAPDCEASATAETTVS S VG 
TCE AAGKS PE PKDYDS TCVFCR I AGRQDPGTELLHCENEDLI C F 
KD I KPAATHH YIi WPKKH I GNCRTLRKDQVELVENMVT VG KTII» 
ERNNFTDFTNVRMGFHMPPFCS I SHLHLHVLAPVDQLGFLS KLV 
YRVNSYWFITADHLIEKLRT 


5618 


3 


1692 


YLNYINLKSENKLSGKEDLWEKLQYL.WKSTLNLPEDLLRVPDES 
L FLN SGGDS I>KS I RLLSEI EKLVGTS VPGLLE 1 1 LSSS I IjE I YN 
H I LQTWPDEDVTFRKSCATKRKLSNINQEEASGTSLHQKAI MT 
FTCHNE INAF Wlf SRGSQI LSLNSTRFLTKLGHCS SAC P SDS VS 
QTNI QNLKGLNS P VI»I G KS KDPSCVAK7SE EG KP AIGTQKMELH 
VRWRSDTGKCVDASPLVVIPTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKWEQI LGDR I ESSACVS KCGNF I WGC YNGLVYVLKS NSG 
EKYWMFTTE DAVKSSATMDPTTGL I Y I GS HDQHAYALD I YRKKC 
VWKSKCGGTVFSS PCLNLI PHHLYFATLGGLLLAVNPATGNVIW 
KHSCGKPLFSSPQCCSQYICIGCVDGNLLCFTHFGEQVWQFSTS 
GPI FSSPCTSPSEQKI FFGSHDCF I YCCNMKGHLQWKFETTSRV 
YATPFAFHNYNGSNEMLLAAASTDGKVWI LESQSGQLQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSLGSPPGAGRGCPCP 
AQS1»HSHQLAAWDPIjKPSI*RSYPPHLLQHPQLRSLTASSGHLGR 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGS PSS PGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 
TVGLRPGLLGERQRGALRAGD PQCQCP LPATVREDLGVPSP WAA 
ECSPPATP 


5620 


930 


192 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRIiFQVEYAIEAI KLGST 
AIGIQTSEGVCIAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 
L I ADAKTIi IDKARVETQNH WFTYNETMTVESVTQAVSNLALQ FG 
EEDADPGAMSRPFGVALLFGGVDEKGPQI*FHMDPSGTFVQCDAR 
AIGSAS EGAQSS LQEVYHKS MTLKEAI KS SL 1 1 LKQVMSEKLNA 
TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


VvEFV EYTA1D/U^ VJvJN toJboo vyyiAa jlivj*i j. vi\.ivii\.r uoixL»Jvukj>\. 
ENDLTWVLKHCERFLKQQQTS IKSSLLCLQGNYAGHDW FVSSLF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
GIHPVYFCSTHYlEMbLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHL.QQDILQ 
HTQTQDLQVFLKEE ALHGFRVS D YFEYME I LEQNYRTVLLRDMR 
NIRIiQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRKPMKAMY
PGTFYFQFKNLWEANDRNETWLCFTVEGI KRRSWSWKTGVFRN 
Q VDS ETHCHAERC FLS W FCDDI I>SPNTKYQ VTWYTS WS P C PDCA 
GEVAEFLARHSNVNIiTIFTARLYYFQYPCYQEGLRSLSQEGVAV 








E I MDY ED F KYCWENFVYNDNEP FKP WKGL KTN FRLLKRR LRESL 
Q 


5623 


3 


354 


FLPFFIRAPKISRNGQWLFTFTTPFPFANKALPGWEGIVPACFW 
RKKILTPSTGTME LLQVTI LFLLPS I CSSNSTGVLEAANNSI>W 
TTTKPSITTPNTESLQKNVVTPTTGTTPKGT1TNELLKMSLMST 
ATFLTS KDEGL KATTTDVR KNDS 1 1 SN VTVTS VTLPNAVS TLQS 
SKPKTETQS S I KTTEI PGS VLQPDASPSKTGTLTSI PVTI PENT 
SQSQVIGTEGGKNASTSATSRSYSS I ILPVVIALIVITLSVFVL 
VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624


159 


898 


PGVAAAAGALPQYHGPAPALVSCRRELSLjSAGSIfQLiERKRRDFT 
SSGSRKLYPDTHALVCLLEDNGFATQQAEI I VSALVKI LEAN.MD 
IVY KDMVTKMQQ E I TFQQ VMSQ I ANVKKDM 1 1 LEKS EFSALRAE 
NEKI KLELHQI*KQQVMDEVI KVRTDTKuDFNLEKSRVKELYSLN 

LDNIKYLAGSI FTCLTVALGFYRLWI 


5625 


1 


1180 


TI PSSAAAQRAG PPAGALEALS PGGARAHAERRG EMRATPLAA"P~ 
AGSLS RKKRLELDDNLDTERPVQKRARSGPQPRLPPCLIiPLS PP 
TAPDRATAVATASRU3PYVL,LEPEEGGRAYQALHCPTGTEYTCR 
VYPVQEAIAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHSLVRSRHR I PBPEAAVLFRQMATAIAHOTQHGLVLRDLKL 
CRFVFADRERKKLVLEWLEDSCVLTGPDDSLWDKHACPAYVGPE 
I LS5 RAS YSG KAAD VWSLGVAL FTMLAGH YPFQDS E PVLL FGKI 
■"-^v^.*-* ± rtJ-»r«vjJji»%.t*>VKl_ij VKt_ijljKKliVAiiiCljXATGILIjHPWLRQ 

DPMPLAPTRSHLWEAAQVVPDGLGLDEAREEEGDREVVLiYG 


5626


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAI S I 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKliTCRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EBBRIiNKIjRLES EGS PETLTKLRKGYLFM YNLVQFLGFSW I FVN 
LTVRFC ILGKES FYDTFHTVADMMYFCQMLAWETI NAAIGVTT 
SPVLPSLIQLLGRNFILFI I FGTMEEMQNKAVVFFVFYLWSAIE 
I FRYS FYMLTCI DMDWKVLTWLRYTLWI PLYPLGCLAEAVS VIQ 
SIPIFNETGRFSFTLPYPVKTKVRFQPPT atvt tmtpt rr vy»tp 
RHLYKQRRRR YGQ KKKKIH 


5627 


3123 


2011 


PPRALGS VAMENQVLTPHVYWAQRHREL YLRVELSDVQNPAI S I 
TENVLHF KAQGHGAKGDNVYE FHLEFLDLVKPEP VYKLTQRQVN 
I T VQKKVSQ WWERIiTKQEKR PLFLAPDFDRWIiDES DAEMELRAK 
EEERLNKLRLESEGS PETLTNLR KG YLFMYNLVQFLGFSWI FVN 
LTVRFC ILGKES FY DT FH TVAJDMT^l Y FCQNILAVVET INAA I GVTT 
S P VLPSL I QLLGRN F I L F 1 1 FGTMEEMQNKAWFF VF YLWSAI E 
I FRYS FYMLTCI DMDWKVLTWLRYTLWI PLYPLGCLAEAVSVIQ 
S I P I FNETGRFS FTLP YP VKI KVRFS FFLQ I YLIM I FLGL Y INF 
RHLYKQRRRRYGQKKKKIH 


5628 


75 


1455 


VAGAMASKCIjKAGFSSGSLKS PGGASGGSTRVSAM ysss pcklp" ~ 
SLSP VARS FS ACS VGLGRSS YRATS CLPALCL PAGGFATS YSGG 
GGHFGEGI LTGNEKETMQS LNDRLAGYLEKVRQLEQENASLESR 
I RE WCEQQ VP YMCPD YQS YFRT I EELQKKTLCS KAENARL WE I 
DNAKLAADDFRTKYETEVSLRQLVESDINGLRRILDDLTLCKSD 
LEAQVES LKE ELLCLKKNHEEEVNS LRCQLGDRLNVE VDAAPP V 
DIjNRVLEEMRCQYBTLVENNRRDAEDWLDTQSEELNQQVVSSSB 
Q LQS CQAE 1 1 ELRRTVNALE I ELQAQHSMRDALESTLAETE AR Y 
SSQLAQKQCM ITNVEAQLAEIRADLERQNQE YQVLLDVRARLEC j 
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCS ARPI CVPCPGGR F 


5629 


2287 


938 


GRPRSS SDNRN FLRERAGLS S AAVQTR I GNS AAS RRS PAAR P PV 
PAPPALPRGRPGTEGSTSLSAPAVLWAVAVWVWSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTLQLFTDGI TNKLIGCYVGNTMED WLVRI YGNKTELLVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPICHVCN 
PAI FRLIARQLAXIHAIHAHNGWIPJCSNLWLKMGKYFSLI PTGF 








ADEDINKK FLSDI PSSQI LQEEMTWMKEILSNLGSPWLCHNDL 
LCKKJ 1 I YNE KQGDVQ F I D Y E Y SGYN YLAYD I GNH FNEFAG VSDV 
DYSLYPDRELQSQWLRAYLEAYKEFKGFGTEVTEKEVEILFIQV 
NQFALASHFFWGIiWALIQAKYSTIEFDFLGYAIVRFNQYFKMKP 
EVTALKVPE 


5630 


1194 


278 


GFVJAI AQTCAHHIjPPGS P WLVPAS PWRLPEMSS FGYRTLTVALF 
TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVWCSTTCNQPEVG 

KSNVSVYQP PRQVILTLQ PTLVAVGKS FTIECRVPTVEPLDSLT 
LFLFRGNETLHYETFGKAAPAPQEATATFNSTADREDGHRNFSC 
LAVLDLMSRGGN I FHKHSAPKMLE I YE P VSDSQMVI IVTWSVL 
LSLFVTSVLLCFIFGQHLRQQRMGTYGVRAAWRRLPQAFRP 


5631 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGICLFGAGGGK 
AGKGGPTPQE A IQRLRDTEEMLS KKQE FLEKKIEQELTAAKKHG 
TKN KRAALQALKRKKR YEKQLAQ I DGTLST I BFQREALENANTN 
TEVLKNMG YAAKAMKAAHD.NMDI DK vDr, JjMQi? IAVQyJh,LxAlyt,xi3 
TAISKPVGFGEEFDEDEIiMAELEELEQEELDKNLLEISGPETVP 
LPNVPS IALPSKPAKKKEEEDDDMKELENWAGSM 


5632 


3 


952 


WLGWSPPRRLVJWGSLGAAQRPAVPVSGLiARSLHVETRRPHRRA 
S VRVARGRLGVWAQPQPLLPR P VG S RREMQPPG P P PAYAPTNGD 
FTFVS SADAEDLSGS I ASPDVKLNLGGDFI KESTATTFLRQRGY 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYKIRCVLMPMPSLGF 
NRQWRDNPDFWG P LAWLFFS MI S LYGQFRWS WI I TI W I FGS 
L»T I FLLARVLGGE V AYGQ VLG V I G Y S Jb Jj flii V XAPVjjj-iVVCjoFE 
WSTLI KLFGVFWAAYSAASLLVGEEFKTKKPLLI YPI FLLYI Y 
FIiSLYTGV 


5633


771 


460 


qgcsktmsvgrpfyrssefmeqllsshi.hqvpffccftwci.cn 
clfensvsklymlcfnffmsiffyslsitklnliylwglsyqsl 
lllllsghrpwgssmv 


5634 


1446 


855 


pratgrirsraaasrpragagasgaeprsgrersrlsgrrapam 
arntls srfrrvdi defdenkfvdeqeeaaaaaaepgpdps evd 
GLLRQGDMLRAFHAALRNS p vntknqavkeraqg wlkvltn fk 
SSE ieqavqsldrngvdllmky I ykgfekptenssavllqwhek 
alavgglgs I irvltarktv 


5635 


3 


943


drgprstatdtgrarvsfwrfpldpgvknsnvqisgekrrfrtl 
rslfhpfpvtrsgapravlvgsswpakmvapavkvargwsglal 
gvrravlqlpgltqvrwsryspefkdplidkeyyrkpveeltee 
ekyvrelkktqlikaapagktssvfedpviskftnmmmiggnkv 

IiARSLMIQTLEAVKRKQFEKYHAASAEEQATIERNPYTIFHQAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 

HYRWW 


5636 


2253 


1143 


LEDTICQHP PAEKKLYLYHRKLREVERNG I PRLPKDVFMDTHQG 
LTDVRAKVTG fsegwds VKGGFS S FSQATHSAAGAWSKPRE I 
ASLIRNKFGSADNI PNLKDSLEEGQVDDAGKALGVI snfqss pk 
YGSEEDCS S AT SGS VGANSTTGGIAVGAS S S KTNTLDMQSSG FD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEEQLNDLTELHQNE 1 LNLKQELASME EKI A YQS Y ERARD I 
QEALEACQTRISKMELQQQQQQWQLEGLElsrATARNLLGKLINI 
LLAVMAVLLVFVSTVANCVVPLMKTRNRTFSTLFLVVF I AFLW K 
HWDALFSYVERFFSSPR 


5637 


948 


2532 


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHLPPPHLH 
HHHHPQHHLHPGSAAAVHPVQQHTSSAAAAAAAAAAAAAMLNPG 
QQQPYFPS PAPGQAPG PAAAAPAQVQ AAAAATVKAHHHQHSHHP 
QQQLDIEPDRPIGYGAFGWWSVTDPRDGKRVALKKMPNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDILQPPHIDYFEEIYVVTE 
LMQSDLHK 1 1 VS PQPLSS DHVKVF LYQ I LRGLKYLHSAG I LHRD 
I KPGNLLVNSNCVLKICD FGLARVEELDESRHMTQEWTQYYRA 
PE ILMGSRH Y-SNAI D I WS VGCIFAELLGRR I LFQAQS P I QQLDL 
I TDLLGTPSLE AMRTACEGAKAH I LRGPHKQPS LP VLYTLSSQA 








THEAVHLLCRMLVFDP YKR I SAKDALAHPYLDEGRLR YHTCMCK 
CCFS TS TG R VYTSDFE P VTNPK FDDT FB KN LS S VRQ VKE 1 1 HQ F 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE \ 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNN ID PVG 
RIQMRTRRTLRGHLAKI YAMHWGTDSRLLVSASQDGKL 1 1 WDSY 
TTNKVHAI PLRSS WVMTCAY AP SGN YVACGGLDNT CS I YNL KTR 
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
KTYSHDNI I CG ITS VS FSKSGRLLLAG YDDFN CNVWDAL KADRA 
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTWKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGN VRVSRELAGHTGYLSCCRFLDDNQI VTS SGDTTCALWDI ET 
GQQTTTFTGHTGDVMSLSLAPDTRLFV3GACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWDALKADHA 
GVLAGHDNRVS CLG VTDDG F4AV ATGS WDS FLKI WN 


5640 



2B0 


1092 


QQGNKKTMLSHNTMMKQRKQQATAIMK2VHGNDVDGMDLGKKVS 
I P RDZ MLEELS HLSNRGARL FKMRQRRS DKYTFEN FQYQS RAQI 
fclHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLEALYPK 
LFKP EGKAEL P D YRS FNR VAT P FGG FEKASRM VKF KVPD FE LL>L 

LTDPRFMSFVNPLSGRRSFNRTPKGWISENIPIVITTEPTDDTT 
VPESEDli 


5641 


27 


332 


CKHNCNGDVKLLS NQMD KLFAFHLFT FHGLLH FLDGS I QKL I QA 

EIILSDWSSILVLENNFIiFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 


ITPCRMDFIjVLFLFYLASVLMGLVIiICVCSKTHSIiKGLtARGGAQ 
I FS CUP ECLQRAMHGL LH YL FHTRNHT FI VLHIiVLQGM VY TE Y 
TWEVFGYCQELELSLHYLLLPYLLLGVNLFFFTLTCGTNPGI I T 
KAN E LLFLHVYE FDEVMFP KN VRCSTCD LRKPARS KHCS VCNWC 
VHR FDHHCVWVNNCIGAWNI RYFL I YVLTLTASAATVAI VSTTF 
LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM 
LGFVWLS FLLGG YLLFVLYLAATNQTTNE WYRGDWAWCQRCPL 
VAWPPSAEPQVHRNIHSHGLRSNLQEIFLPAFPCHERKKQE 


5643 


1 


847 


PSGGVRDVETRGPGSRAARGPRVVMHRRGVGAGAIAKKKLAEAK 
YKERGTVLAEDQLAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
R VQ FQDMCATIG VD PLASGKG FWSEM LGVGDFYYELGVQII EVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 
ALGTGFGI IPVGGTYLIQS VPAELNMDHTVVLQLAEKNGYVTVS 
EIKASLKWETERARQVLEHIiLKEGLAWLDLQAPGEAHYWLPALF 
TDLYSQE ITAEEAREALP 


5644 


B3 


1138


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAIEDKDMQQKBQQFREWFLKEFPQI RWKIQES IERLRVIANE 
I EKVHRGCVI ANWSGSTGI LS VIGVMLAPFTAGLSLS ITAAGV 
GLGIAS ATAG I AS S I VENTYTR S AELTASRLTATS TDQLEALRD 
I LHD I T PNVLS FALDFDEATKM IANDVHTLRRSKATVGRPLIAW 
RYVPINWETLRTRGAPTR IVRKVARNLGKATSG VLWLDWNL 
VQDSLDLHKGEKSESAELLRQWAQELEENLNELTHIHQSLKAG 


5645 


537 


799 


VQS VRDLKR LS PTD P PGDSGNR D VTRE D P VTG PLNS AS SQ VPTL 
YLCLQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646


3745 


3328 


AEQYGTS PHLLPTMLLSSCLPPANVTTKAATPPPLVLS LTTADP 
AGKPAPCRVTLTLLRAS I PATKRAS FLSS FI KMFFEELEYILGF 
LSLLKFHVHVS V YSAI CHFQKEGTGNSRS FTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


GVI ^4ATSELSCEVS EENCERREAFWAEWKDLTLSTRP BEGCSLH 
EEDTQRHETYHQQGQCQVLVQRSPWLMMRMGILGRGLQEYQLPY 








QR VLPLP J FTPAKMGATKEEREDTPIQLQELLALETAIX3GQCVD 
RQEVAEITKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648 


7 


1518 


VLS SLCGRHEALRE VGAEWPPPTCSPNI CSGLQQAGNTDWSLTM 
APQS LPS S RMAPLGMLLGLLMAAC FTF CLSHQNLK E FALTN PER 
SSTKETERKETKAEEELDAEVliEVFHPTHEWQALQPGQAVPAGS 
HVRLNLQTGEREAKLQYEDKFRWNLKGKRLDINTNTYTSQDLKS 
ALiAK FKEGAEMESS KED KARQAE VKRLFR P IE ELKKDFDELNW 
I ETDMQIM VRL I NKFNS S S SS LEE KI AALFDLE Y YVHQMDNAQD 
LLS FGGLQ WINGLNSTE P LVKE YAAFVLG AAFSSN PKVQVEAI 
BGGALQKLLVXLATEQPLTAKKKVLFALCSLLRHFPYAQRQFLK 
LGGLQVLRTLVQEKGTEVLAVRVVTLLYDLVTEKMFAEEEAELT 
OEMS PEKLQQ YRQVHLLPGLWEQGWCEI T VHLLALPEHDAREKV 
LQTLGVLLTTCRDRYRQDPQLGRTLASLQAEYQVLASLELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL
KQLRKRGS I PTS LTDLSLAS AS PPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLS PVSREENREDKATI KCETSP PSSPR 

KG I KSS IGRLFGKKEKGRL IQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 
ELWVGM PAW YVAACRANV KSG AIMS AL3 DTE I QRE I G I SNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV 
DARMLDHLTKKDLRVHLKMVDSFHRTS^QYGIMCLKRLNYDRKE 
LEKRREESQHE I KDVLVWTNDQ WHV/VQ 5 1 GLRD YAGNLHE SG V 
H3ALLALDENFDHNTLALILQ I PTQNTQARQVMERE FNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5650 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL 
KQLRKRGS I PTSLTDLSLAS AS PPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLS PVSREENREDKATIKCETSPPSSPR 

Tr.RT<PK'T^;HP&T. c !nPPr!V < ?Zv7,prw2QNT><;ilQMCCr , incT u trr> 7\ ir t> 
x utujr, ruLAaxi f Alio ytJiuAortJjijL'i^ui'Ii roooWoo \JU^ljriz\.\-*/\i\H. 

KGI KSS I GRLFGKKE KGRLIQLSRDGATGH VLLTDS E FSMQEPM 

VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQV3DGPTWSWL 

ELWGMPAWYVAACRANVKSGAI MS ALSDTEIQRE I G ISNALHR 

LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 

TDS EEGS WAQTLAYGDMNHE WTGNEWLP S LGLPQ YRS YFMECLV 

DARMLDHLTKKDLRVHLKMVDS FHRTS LQYG I MCLKRLMYDRKE 

LEXRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV 

HGALLALDENFDHOTI^ILQIPTQrn-QARQVMEREFl^LLALG 

TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 

GFRV3TLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5651 


646 


1869 


ARQGQRQ PWG * EARAKGPASE S PRV * EGSGWEGPAS P * T PGSTL 
AWGEGAGI R+ ASGLTAAGAAS AAAA/ PPPTRGGPAPAGCGRAPP 
WPAPLRVPTHGRAPAPRS RAA PRAPALSHGTAAAALS PAS PAGP 
ADP* LPGHSSQS PPRG * RWGRS RS APAPAH PEH PAPAGS AS ASQ 
QTPGWPGSCCLAQGWQAEPLGAPGAEDG\PVPPQRGFPLGTLGS 
PAGSWAGLAGYG*AGAPGTQATAPRAAGQTPVAAAPNCRV*GSA 
PALHRAPAAADPGS PLQAPPRAWAS PAAAG PGLS SSDYCGGLGA 
GWRAGISPELLGAAGLSDNWARCPGPGPAB *GGQ PGCRTI PASA 
CM PS P PVEGS LGLS RKGHGDLPSQAR * GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 


5652 


735 


343 


HHKKYQHIHQKS FSCPEPACGKS FNFKKHLKEHMKLHSDTRD YI 
CE FCARS FRTS SNLV IHRR IHTGE KPLQCE I CGFTCRQKAS LNW 
HQRKHAETVAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA 


5653 


66 


1401 


RGRLQSRGRLTLGLVLLLLD I LGARQHGQRVSHGWKGGFLTAPL 
CFPQPCQPGTRRGRRRSLKEATEPQLAMAEEFVTLKDVGMDFTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPKLTSHPDGSED 
LEPLAGGS PEATS PDVTETKNS PLMEDFFEEG FSQEI /SRD VTQ 
GWLLELQFRRSLYRGHLVR+FARRSRKSSEV*YCHQRGKSHGMQ 








r.t> -L/UiKJ.Ui>l„vaKJc*HijKKi?HU \DNVSKKTLTPAKSKEYRGEFF 
SYSDHSQQDSVQEGEKPYQCSECGKSFSGSYRLTQHWITHTREK 
PT VHQECEQG FDRKAS HSG YP KTHTG YKFYVCNE YGTP FS QSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 

ECGKAFTR I FHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAL 
NWKPFVYGGLAS I TAECGTFPI DLTKTR FQ I QGQTNDAKFKEI I 
YRGMLHALVR IGREEGLKALYSG * VGLHAFLCHCSLFHMG IDFR 
PRLHRS Q VKS LR CV* KEQ I A* * /M FSL>L I STI»I S K Y I Y YAADVL 
EKLFYY IQVQTDMNKKI CLFKN I 


5655


2 


867 


RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMIPFKDEGDPQ\REKIFAEIVNPEEEGDLADIKSSLVNES 
EI I PASNGHE VARQAQTS Q E P YHDKAREH PDJDGKHPDGGLYNKG 
PSYSSYSGYIMMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHPLTPLITYSDEHFSPGSHPSH1PSDVNSKOGM5RHPPAPDI 
PTF Y PLS PGGGGQ I TP PLGWQGQP 


5656 


228 


1066 


PRR VP PLjPE FAS GPGAAF FHSGRI*QRS LTKDS AGCFSQCRS RAM 
IjVLRS Gl»TKAl»ASRTliAPQ VCSS FATG PRQ YDGTF YEFRTYYTjK 
PSNMNAFMENIiKKNIHLRTS YSELVGFWS VEFGGRTNKVFH I WK 
YDNFPHRAEVRKALANCKEWQEQS 1 1 PNLARIDKQETEITYLIP 
WSKLQKPPKEGVYELAVFQMKPGGPAJbWGDAFERAINAHVNLGY 
TKWGVFHTEYGELNRVHVLWWNESAD3RAAVRHKSHEDPISWG 
G VRESVNYL \ VSQQNM 


5657 


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKLLLYDTLQGELQBRrQR 
LEEDRQSbDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
P Y I VYML.QE I D I LEDWTAI K KARAAVS PQKRKS D \DI»DPAVHS Q 

GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
AI^VWTPPIj 


5658 


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLI*A1»LKCTDTE LQLRRDA I FCQALVAAVCTFS EQLLAAIiGY 
RYNNNGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMIjEDIVfVTIjSELDNVTFSFKQIjDENYVANTNVFYHIEGSRQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAEDLQOD1NAQSLEKVQQYYRKLRAFYLERSNLPTDAST 
TAVKIDQIi I RP INAUDELCRLMKSFVHPKPGAAGS VGAGL I P IS 
SELCYRLGACQMVMCGTGMQRS TLS VS LEQAAI LARSHGLliP KC 

IMQATDIMRKQGPRVEIIiAKMLRVKDQMPQGAPRLYRiCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVSPKGELGAWRGNSGRPKIIGRAAEAENEDRTLGRLLP 
GNERSQPRSPLRLLAPQLKAEAAADKG1APVPPPFSSGHSGPC\ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
Du^rrtu/i^b X/viVALjAAijOAKRr IiCG WEGFYGRP WVMEQRKEL 
FRRLQKVJELiNTYL


5660 


229 


853 


PVTMWAFSEI^MPLLINLIVSIJ J GFVATVTLIPAFRGHFIAAPX~ 
CGQDLNKTSRQQlPESQSVISGAVFIillLFCFIPFPFIiNCFVKE 
QRKAFPHHE FVALI GALLAI CCMI FLG FADDVLNLRWRHKLLLP 
TAASLPLLMVYFTNFGNTTI WPKPFRP I LGLHLDLGR * SYHCC 
P YGT YFRE P FLVLH I LLQVFI»FCIiCVFPD P FW 


5661 


2 


473 


lnlypspcggi pklipglpreaaaalgas flaeaplpvtvrgsgi* 
agmavtcdp:<afi,sicfvti,vflqlplasicqn*gtdscasrgk 
adfdvtgphap iltamagghvelqcqlfpni saedmelrwyrcqp 
slavhmhergmdmdgeqkwqyrgrt 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PF PKHKPS AKLS VRDALGAQNAS GERI Kl QG Wl RS VRSQKE VL F 






WO 01/53312 



PCT/US00/34263











LHVNDGSSLESLQWADSGLDSRELTFGSS VEVQGQI,I KS PSKR 
QNVELKAEKI KV IGNCDAKDFP I KYKERHPLE YLRQYPHFRCRT 
NVLGS I LRI RSEATAAIHS FFKDSGFVHIHTP I ITSNDSEGAGE 
LPQLE P SGKLKVPEENP FNVPAFLTVSGQIjHLEVMSGAFTQ VFT 
FG PTFRAENSQS RRHLAEF YMI EAEI S F VDS LQDLMQV I EELFK 
ATTMMVLSKCPEDVEIiCHKFIAPGQKDRL*HMLKNNFLIISYTE 
AVEILKQASQNFTFTPEWGADLRTEHEKYLVKHCGNIPVFVINY 
PLTLKP FYMRDNEDGPQEliEGSVA*HSLGLMII»LS I WIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL 
VQS YFEKGPLTFRDVA I E F S LEEWQCLDS AQQG LYRKVMLENYR 
NLVFLGIALTKPDLITCLEQGKEPWNIKRHEMVAKPPVICSIIFP 
QDLWAEQDI KDS FQEAI LKKYGKYGHANFQLQKGCKS VDECKVH 
KEHDNKLNQCLI PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAAIiRALVQRTGYSLVQENGQRKYG 
GPPPGWDAAPPERGCEIFIGKbPRDLFEDELIPLCEKIGKIYEM 
RMMMDFNGNNRGYAFVTFSNKVEAKNAIKQLNNYEIRNGRLLGV 
CASVDNCRLFVGGIPKTKK 


5665 


347 


702 


WQHLI ILLHCERTSPAMITSELPVLQDSTNETTAHSDAGSELE 
ETEVKGXRKRGRPGRPPSTNKKPRKSPGEKSRIEAGIRGAGRGR 
ANGHP QQNGEGEP VTLFE WKIX3KSAMQRC 


5666 


213 


540 


VSCLPTS CKM I TLNNQDQP VP FNS SHP DEY K I AAL»V FYS C I FI I
GLFVNITALWVFSCTTKKRTTVTIYMMNVALVDLIFIMTLPFRM 
FYYAXDEWPFGEYFCQI LGA 


5667 


1 


695 


HPLPSASIiGIiPSVSLGVSLCVRSALLEAWPHLPKRRRARVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFVTGLARSKGFR 
VLDACS SEATHWMEETSAE EAVS WQERRMAAAP PGCTP PAIiLD 
I SWLTESl*GAGQPVPVECRHRI*EVAGPSKGPLS PAWMPAYACQR 
PTPIiTHHNTGLSEALEI LAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQLQ 


5668 


691 


894 


CSFLFCIPDLFLQFLLGRKEEEAVLVGGEWSPSLDGLDPQADPQ 
VLVRTAIRCAQAQTGIDLSGCTKW 


5669 


407 


1 


DSGAPEGI*SPLMSTQEGJbSMHAHPQAYTPFIYLHARKRRGEIGD 
ADSRFNDR YAHKSAQIj Y FI*Y FVCW I FQDVYYFTI KEKNHFFFPK 
ARGAPTKYSGSPIGSPTTTPPTRPPSFNIiHPAPHIiLASMQLQKI* 
NSQ 


5670 


3 


373 


SSECLTMAWIPLLLPLLILCTVSVASYELAQPSSVSVSPGQTAK 
ITCSGDVLAKKYARWFQQKPGQAPVLVIYKDTERPSGIPERFSG 
S TSGTTVTLTISGAQVEDEAD Y FCYS ATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLGMESAITI*V7QFl4ljQLLLDQKHEHLICWTSNDGE 
FKLIjKAKKVAKIjWGIjRKHKTNMNYD KLSRALRIiLFMT 


5672 


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSLYIRIVEGKNLPAKDITGS 
S DP YC I VKVDNEP 1 1 RTATVWKTLCP Fl* GEE YQVHLP PTFHAVA 
FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETSPLGSVWSPAOGKPFLLSPEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


ITVADQ I SHWSAGR I KNRTRI PECIHSSAA'ITJbAGPHTMEGESV 
KI*S SQTI*I QAGDDEKNQRT ITVNPAHMGKAFKVMNELRSKQLLC 
DVMI VAEDVEI EAHRWLAACS PYFCAMFTGDMS 


5674 


17 


984 


GGGSMEGESTSAVLSGFVLGALAFQHLNTDSDTEGFLLGEVKGE 
AKNS ITDSQMDDVE WYT IDI QKYIPCYOLFSFYNSSGEVNEQA 
LKKILSNVKKNWGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DLVFLLLTPSIITESCSTHRLEHSIiYKPQKGIiFHRVPLWANLG 
MSEQIiGYKT VSGS CMSTGFSRAVQTHS 5 KFFEEDGS LKEVHKIN 
EMYASLQEELKS ICKKVEDSEQAVDKLVKDVNRLKREI EKRRGA 
QI QAAREKNIQKDPQENI FLCQALRTF FPNS EFLHS C VMSLKID 
MFLKVAVTTTTISM 


5675 


60 


753 


" EGSRRGPTRLARLS ARAGRDH FP PGFS SRLIHFRGVSECRRPPG 
KSGVPVSAPGSDGKWWEERPGMFSIJVIASCCGWFKRWREPVRKVT 
LLMVGLDKAGKTATAKGIQGE YPEDYAPTVG FSK I NLRQGKFE V 
TIFDI^GGIRIRGIWKNYYABSYGVIFVVDSSDEERMEETKEAM 








SEMLRHPRISGKPILVLANKQDKEGALGEADVIECLSLEKLWE 
HKCL 


5676 


2 


930 


FVSS PPPRP VQPARPGG FGLSGRRS LLCQVASTPAHVG VMRS P V 

RDLARNDG EES TDRT PLLPG APRAEAAP VCCS ARYNLAI LAFFG 

F FI VY ALRVNliS VAL VDM VDSNTTLEDNRTSKACPEHS AP I KVH 

HNQTG K KYQ WDAETQG W I LGS FFYGYI ITQX PGGYVAS KIGGKM 

LLGFG ILGTAVLTLFTP IAADLGVGPLIVLRALEGLGEGVTFPA 

MHAMWSS WAPPLERS KLLS IS YAGAQLGTVISLPLSGI ICYYMN 

WTYVFYFFGTIGIPWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDGF^ELRRLSVPLCSGPCPLTSLSROGERSGGHLVAAARAA
VTAETHPLPLLAPLAVCQSVKSPAACQVRPRPRAVALPAALGGP 
GRSL PGLTAATMSS FS E SALE KKLSE LSNSQQS VQTLS LWL IHH 
RKHAGPIVSWHRELRKAKSNRKLTFLYLANDVIQNSKRKGPEF 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
I QQL KLSMEDS KS F P PKATEEKKSLKRTFQQ I QEEEDDD YPGS Y
S PQDPS AGPLLTEELI KALQDLENAASG0ATVRQKI AS LPQEVQ 
DVSLLEKI TDK3AAERLS KTVDEACLRNRGPGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SLAE FTEQFNQLHNRRNENLQLGPLGRD PPQECSTFS PTDSGEE 
PGQLS PSVQFQRRQNQRRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQPDFDVSKRLSLPMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYII4EPSIFNTLKRYFQAGGSPENVIQL 
LSENYTAVAQTVNLLAEWL IQTGVEPVQVQETVENHLKSLLI KH 
FDPRKADS I FTEEGETPAWLEQMI AHTTWRDLFYKLAEAHPDCL 
NILNFTVKVGRVLELRRKVFMNVYFWLLVCFL 


5680 


258 


592 


RRLTSTSEKLQWRNSHTPLESLIHPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQVmSLLADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 
5682


45 


869 


LLCAKTLGVRTKESQAEG YNRSGINNHQAEDPRFCPSFCWMRSA
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWI V YF IAVEDDKI LPLNS AERKFGVKHAP YI S I AGDDP 
PAS CVFSQVMNMAAFLAL WAVLR FI QLKPKVLNPWLN1 SGLVA 
LCLASFGMTLLGNFQLTNDEEIHNVGTSLTFGFGTLTCWIQAAL 
TLKVNI KNEGRRVG I PRV I LS AS I TLCVGPLLHPHGPKHPHVCS 
QGPVG PGHVL 




39 




PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPI PSPGLSAQTGL
QKIWGTIHCQVCPGAPAWPGSPWHEEMGLLLLVPLLLLPGSYGL 
PFYNGFYYSNSANDQNLGNGHGKDLLNGVKLWETPEETLFTYQ 
GASVILPCRYRYEPALVSPRRVRVKWWKLSENGAPEKDVLVAIG 

T .DUD O D/^nvrv^nifT ft r~»r™vT-\ 

J-*Kt1Kc> r <aU YyGRVHLiRQD 


5683 


89 


778 


GSCGATALITRCLAWSVLISRLAMATYTCITCRVAFRDADMQRA 
K YKTDVJHR YNLRRKVASMAP VTAEG FQERVRAQRAVAEEES KGS 
ATYCTVCS KKFAS FNAYENHLKSRRH VELEKKAVQAVNRKVEMM 
NEKNLEKGLGVDS VDKDAMNAAIQQAI KAQPSMS PKKAPPAPAK 

EARNVVAVGTGGRGTHDRDPSEKP PRLQ WFECQAKKLAKHS EDD 
SEDEEHDLC 


5684 


195 


677 


TWCFRG YLGPR V IMXALDE P P YLTVG TD VSAKYRGAFCEAK I KT 
AKRL VK VKVTFRHDSS TVE VQDDH I KG PLKVGAI VEVKNLDGAY 
Q3AVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


I^LQQPVVHCFLLFPPFRFSHHMIPGPPGPHTTGIPHPAIVTPQ 
VKQEHPHTDSDIJviHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQI LGRRWHALSREEQAKYYELARKE 
RQLHMQLYPGWSARDNYVSPSS I PVALHS 


5686 


128 


1181 


CTWWQVN I TLLD INDNHPTWKDAP Y YI NLVEMTPPDSDVTTWA 
VDPDLGENGTLVYSIQPPNKPYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRK I WS VTDCGRPP LKATSS ATVFVNLLDLNDNDPTF 








QNLPFVAEVLEG I PAGVS I YQVVAI DLDEGLNGLVS YRMPVGMP 
RMDFLINSS SG WVTTTELDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEDVPR\GSGWSG*AARN 
NDVGLNAELS Y F ITGGNVDGKFS VGYRDAWRT WGLDR ETTAA 
YMLI LEAIDNGPVGKRHTGTAT VFVTVLDVNDKRP1 ILQSSYV 


5687


17 


917 


AAPPAPPDG/PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA 
QGDGGAAAVGH VLWPAVGPVR VNPGLQTP VPRPELLPG P \ S SS 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRMPSTSASE/AAGGQGACTHAXGSETPP PAS PQTSE PAPS P 
LP PHLtTGGP GM YSSEAKJL»PNS FS CLGLAGTG AG I * GTASAHGTG 
PPVUPHVCTPSLANPQP\AVGPEASSLPIiGVSGlGMSA/SAPIS 
SS PFVAIGS CWLRGI PPPGSGFLCPGRAPG PVP ITTHGQEGQGP 
VLDI 


5688 


1 


420 


LT KWDL FGNC YRLLKTG I EHGAMP EQVGVYWYS/ CLYDSRKLFF 
*SHMI IRSLL*KVIDDSLGQLPLI.RELLL* *LNVIDRCI ILAYV 
LRVEKTFAITYLKNFTVKVDFSLLGE I PLISMAAILKLWI MKID 
DGYIPAVF 


5689 


1504 


3 


HELSGKHISMVSGNTCNWHPGGHSPGGGGQGEITSKDRGEIPAL
IWA/RK?IGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSL»PFIiGPKALGF*SASCQRAFEGGAH 
GSTAR KPAPAT PGTRHPRTMETR E VAQGWPAG PRSQFWDQH PHS 
PGEHRPSG \ S P L P ACPPRAW PKAGAVAS ATGTG \ PQL PGS RGKQ 
KLPRTRE PPLLQAGWAVRKP PWS EAKEGLGQAGRPSGMDSSAS \ 
PQTPGGRGSLEV?GLPLYLGPHHDVK*RSDRLG* PP * GGQGGGGH 
GAPSTFGPGGEAW*DPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL * RVP PGSLGPSTQOT YE PTDKHS \GG ADAQLEVS TAGS RS T F 
GQELKGPLDAGRLWPGAPS AS S SHR* GG * ERARAGAGH RGST * A 
SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 
GS PPPA* AS AG RKGTVSTLGGGLL 


5690 


1424 


58 


PS PFAG VCAAPAP LPLLALARRDRR P CS PGAEAAP WQTGGPAI D
GAWRTS VSALRRGATG/ APCS PGAEAAP WQTGGPAI DG\DGELP 
*VRSEEAPRGCGAEGGGPGSGPVRRPGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQWRHVRRDPQAPPGGPAP 
GHAAALPERTRGVAE P P AWAHAGS DAWRAGR* SQRT * ERARPRH 
PTFQGRAG S \GQPG YQ PPNPH PGPSSP PAAP\GPRGA* GNPQLE 
KAPRSDRNPSC2GLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLS LLG P / PGAHNLDTAPQDR * HGP*GDKRGAPG VAGEDPR P P * 
GNFVR * LLLMP/GVA* RHGTS PFLGPSLGENGGQWDSGNLFGTP 
KG * SHPAFTKST * SMEAEKS YWNHPHR \DRGRQGVR I NCLRVGE 
SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGIjNS PGLI*P 


5691 


107 


550 


ISNDPS PGYN1EQMAKRGKKLVELPYTVKGMDVSFSG ILS FIED 
VAHRMLATGECTPEDLCFSLQVt^Q * KTGTES WG* RF Y I VEQN* S 
GDAPLI FS P YLSLTGNCGFAMLVEITERAMAH\ CGSPGGPSLWG 
GVGVYVLLESVPLSYS 


5692 


1193 


548


TQAWTRAEKDRKGSVRALRLHLERGPPT*RGSHPIAQSVPCIQK
PS I FSSYP I /GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPIi 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLKLPVMGATRSNIjQPPRKVAVPGPTR*RDQDSKQDFSSKPLQS 
VPG1ASTQQTLTPADS GPGTGGRDATRAGLPG VETMGNG VD 


5693 


1258 


1330 


ALTVVPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAI VWRS S RF PLWFP LRCCFW VSG FKDPNPVLRFF 


5694 


3 


1338 


GS KEP ARS LHRRGSGHKS S AGKWGSVTLS TAG ALG * KQLHQ * WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNST1VLRTDSEKR 
SLAESGLSWFSESEBKAPKKLBYDSGSLKMEPGTSKWRRERPES 
CDDSS KGGELKKPI S LGHPGSLKKG KTPP VAVTS PITHTAQSAI/ 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPS TSGS FGYKKP P PATGTATVMQTGGSATLS KI QKS SG I P V 
5695



1338 



5696 



1338 



5697 



1147 



47 



5698 



666 



5699 



1448 



5700 



"92T 



597 



5701 



59 



410 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
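The notation in the legend above can be read mechanically. The following is a minimal sketch, not part of the original listing: the helper names `summarize_segment` and `codons_spanned` are hypothetical, the short test string is invented for illustration, and dividing the nucleotide span by three is an assumption about how the two location columns relate to segment length.

```python
# Symbols defined in the table legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")  # one-letter amino acid codes, X = unknown
STOP, DELETION, INSERTION = "*", "/", "\\"


def summarize_segment(segment: str) -> dict:
    """Count residues and annotation symbols in one table entry (hypothetical helper)."""
    cleaned = "".join(segment.split())  # ignore spaces and line breaks
    known = RESIDUES | {STOP, DELETION, INSERTION}
    return {
        "residues": sum(ch in RESIDUES for ch in cleaned),
        "stop_codons": cleaned.count(STOP),
        "possible_deletions": cleaned.count(DELETION),
        "possible_insertions": cleaned.count(INSERTION),
        # Characters outside the legend, for example scanning damage.
        "unrecognized": sorted(set(cleaned) - known),
    }


def codons_spanned(begin: int, end: int) -> int:
    """Whole codons between two nucleotide positions, inclusive.

    Assumption: the span is counted the same way in either orientation.
    """
    return (abs(end - begin) + 1) // 3


if __name__ == "__main__":
    # Invented example string, not taken from the listing.
    print(summarize_segment("MKT*AL/GP\\R"))
    # Locations 23 and 562 span 540 nucleotides.
    print(codons_spanned(23, 562))
```

Under these assumptions, an entry whose two locations are 23 and 562 spans 540 nucleotides, or 180 codons. The `unrecognized` field is a convenient way to flag characters that are not defined by the legend.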
KPVNGRKTSLDVSNSAEPGFLAPGARS NlQyRSLPRPAKSSSMS 
| VTGGRGGPRPVSSSIDPSIjIjSTKQGGLTPSRIjKEPTKVASGRTT 
PAPVNQTDREKEKAKAKA^ALDSDNISLKSIGSPESTPKNQASH 

ptatklaelpptplrataksevkppslanldkvnsnsldlpsss 

, DTTQCI 

gskeparslhrrgsghkssagkwgsvtlstagalg*kqlhq*wt 
qrcl\nni,sseefnasssl*hslpstptasrrnstivlrtdsekr 

SLAESGLSWFSESKEKAPfCKLEYDSGSLKMBPGTSKWRRERPES 

cddsskggelkkpislghpgslkkgktppvavtspithtaqsal 
kvagkpegkatdkgkiavkmtglqrsssdagrdrlsdakkppsg 
. iarpstsgsfgykkpppatgtatvmqtcgsatlskiqkssgipv 

KPVNGRKTSLDVSKSAEPGFLAPQARSNIQYRSIiPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLI^TKQGGIiTPSRLKEPTKVASGRTT 

I papvnqtdrekekakakavt^dsdnislksigspestpknqash 

PTATKIAELPPTPLRATAKSFVKPPSIANLDKVNSNSLDLPSSS 
DTTQCI

GS KE PARS ItH R RGSGHKSSjVS KWGS V TLSTAGALG * KQLHQ * WT 
QRCLVNNLSSEEFNASSSIa>JSiPSrPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSbKMEPGTSKWRRERPES 
CDDS S KGGE LK KP IS LGHPGSIiKXGKTP PVAVTS PI THTAQSAL 

KVAGKPEGKATDKGKlAVKNTGLQRSSSDAGRDRLSnAKKPPSG 
IAR PS TSGS FG YKK P P PATGTATVWQTGGS ATLS KIQKS SG I PV 
KPVWGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSriTISLKSIGSPESTPKNQASH

PTATKLAELPPTPLRATAKSFVKPPSIANLDKVNSNSLDLPSSS 
DTTQCI 

! PSBA LSPPACPSAPAPRRSriSRiFGTSPATEAAPPPPEPVPAA 
QG P ATVQS VE D FVP DDRLDR.S FX EDTT P ARDE K KVG AKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPI.PAGPVPSQDITL 
| SSEEEAEVAAPTKGPAPAPQQCSEPETKWSSI PAS KPRRGTAPT 
I RTAAP P WPGG VS VRTGP EKR. SS TPP PAEMEPGKGEQASSSESD P 
j EGP I AAQMLS FVMDDPD FES EGSDTQRRADDFP VRDD PSDVTDE 
DEG PAE PPPP p KLPLPAFRL* KNDSDLFGLGLEEAGPKESSEEGK 
EGRTPSKENKKKKKFCOWRPtt Pir&&rvtrQtruwevntrr.^ n 



RRQQRPPRSRERTAA 

GAEAAEPQEULPPLSQSSRFrQEQQKM NKSIiGPVSFKDVAVDFT 
QBE WQQLDPEQKI TYRDVML EWYSNLVS VGYHI IKPDVISKLEQ 
GEE P W I VEGEFLLQ S YPDEVWQ TDDLI ER IQEEENKPS.RQTVFI 
ETLI*R/ERGNVPGNTFDVETWPVPSRKIAYTHSI,CNSCER\GF 
NASSEYISSDGRYARMKADECSGOaCSI^IKDEKTHPGDQAVE 



RVRQP PGLW VRRTVPAMQCPAG LS RVPG VAG/DPS LPS FRGPR D 
EAAHRGTIQTARHTRKLYVQC3PASGPPLPRVSTQVAI*DEKPLA 
RPS /GRTNAPFPQGQKPAGKAAP0PAAAGRVAWR\ PGHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGW\^SGIiLLLGDPSGPGSL+RS 

twlvggargpegsgvrgsgwpsgcsdigwalagwnhs*hi 1 dpnt 

WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCEIiTTESQYSNN 

vpilfqnpsgalrsrrtepagwpptrhe+ddg*taapasggap 

VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRIAAGA«TSASARCPPAAAA 
GWQPRRPGFAGRAALPGPPHPPSS *RELGGLPGPGW* TLDPLPA 
HPAHPPGSAPPWGALGGWAAAitASIiPWS PSLCLSFPAVTPVAGL 
FPPGRG 



NGHKGVWE JLN l Y*RRSNIHKN"SKSES HIiNQDHSFPPPTPNSARS " 
KLHS TGTAKNTGLPLSGAPRQRAVFSGRT ICQE FSS CLQCAYLD 
E*CSIASSLIKAILRVSVLSE 



IFEKICSDTQEFISPEINPQI CSWIiIFTjkGAK/N H A TGKDSIiFN 
KWSWKNWI*STCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKIj 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK 


5702 


3


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PV I TPSRASES SASSDGPHP VITPSWSPGSDVTL.LAE ALVTVTN 
IEVINCS I TEI ETTTSS I PGASDTDLIPTEGVKASSTSDPPALP 
DS TEA KPH I TEVTAS AETLSTAGTTE SAAPHATVGTPIiPTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 
PVSIEAGSAVGKTTSFAGSSASSYSPSEAALKNFTPSETLTMDI 
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 
KG* CSSSTGNSTPTRLTSRSPYCVSGEANG/ PSAAARHVP YAKR 
GCCP * PGPPPTDCS CVT VLRGTQ KVPMKGSMS KPLTPDVATGPS 
LTSTGVYVWGGASPVPRGVLGLTLAHVLCFSKEKT 


5703


14 


1117 


HHKDSRSQGLPRTQECARPEIJ*PLLCPRALWPVTRLSYRCPWQA 
P KAG IGTKAKP S ESHLKLH PGW PS LDRQGE PATLGTGTGHCSDS 
R I LRWHP * HTAAR* PRWRRLPSSHRWTRHLGVuRVQDKS* *VSL 
DPSCRPRFLRTC+*YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS * WCPWL* AARWTGWRTASGASAGLGRAADRPS AWA 
RRVAGLLPGQGLTVRR* H* TAGAPAS VRSSQGATRSPAPGGDQC 
ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHS PHDTQTPEP 


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP


5706 


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD 
DYVANTDNCSLKDI»VRECERRYCAFNNWGSVEEQRQQQAELIiAV 
I E RLGREREGS FHSNDLFLDAQLLQRTG AGACQEDYRQ YQAKVE 
WQVEKHKQELRENESNWAYKAIjIjRVKHIjMLLHYE I F VFLLLCS I 
LFFIIFDF 


5707 


28 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAIQPGLAEGGQFLGDPPPGLCQPELQPDSNSNFMASAKDANE 

nwhgmpgrvbpilrrssses psdnqafqapgs peegvrs ppega 
eipgaepekmggagtvcs pledngyas sslsidsrssspepacg 
tprgpgppdpllpsvaqa 


5708 


44 


1925 


sfsweeti spcfpkmpae pwwlspvslgaagwpgqprpyldlpa 
qasvsrphdra*geavslslssgdvcghtdgggagsdpqakpkp 
prcpftampsprtkqkvrnkvouliairysdipsdvskap\gpa 

GNPHDRSSTAA*IiHRRAGAGSLCLSASI»LPPSFSLGAPGAPSPL 
rvspasggprkegrqgsgg * AGGGGP \ARTHADLPCVGFVCS pp 
llk*sdspvkqlpa\sgqgsgagmppvgssdilrprptsvsgtg 
raag * cswqpaacctprsq* wavarspsrcsrw*rqsgr*rg* s 

SRRRRGP* AAGRStpavp * pcq ♦nnzwTR'P & Ynr , wiY:ianvap cp * 
LEPSGPTSGSAL* twashstga* *srlcgtagtgplcsqssrs * 

AG *RCCCTAAS PCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMGSRCVCTCTGLPCPG I PLSGASPGGSGETGAGRSHTLK 
AARSRLS PRPGSGSRGSY* SHNDNWGTWPAPPSAGHXLVGG *ns 
QRTSSDH * YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS /SGGASGS PAASCSCSCRAPAKPASS /GEAPA 
PPPRPEPPPPPARRP 


5709


2 


2C31 


ITLCPLPQTEKCIaNVVTEAATPLGIYLKARVEAGGLKEbEISWG 
IjHQIVVRWGAVVMRAGMGGCRCWGVMAPFAPR/NALS FbVNDCS 
LIHNNVCMAAVFVDRAGEWKLGGLD YM YSAQGNGGGPPRKG1 PE 
LEQYDPPELADSSGRWREKRSADMWRU3CLIWEVFNGPLPRAA 
ALRNPG KIP KTL»VPH YCE L VGAN P K V K PN P AR F LQNCRA PGGFM 
£>imkf viiiwjjirijiiiiiQXK^PAEKQKFFQEIjSKSLDAFPEDFCRHK 
VLPQLLTAF E FGNAGAWLTPLFKVG KFLS AEE YQQKI I PVWK 
MFS STDRAMR IRLLQQMEQFIQ YLDEPTVNTQI FPHWHGFLDT 
NPAIREQTVKSMLLLAPKLNEANLNVELMKHFARLQAKDEQGPI 
RCNTTVCI/3KIGSYLSASTRHRVLTSAFSRATRDPFAPSRVAGV 
LG FAATHNL YSMNDCAQKI LP VLCG LT VD PE KS VRDQAFKA I RS . 
FLS KLE5 VSEDPTQLEE VEKDVHAASS PGMGGAAAS WAGWAVTG 
VSSLTSKLIRSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS \TPTTNPPNPQS PTGAAGK\ RGLLGTGLA 
GAKLPGATS *RYTAGQRV 


5710 


1 


562 


I PGST I SCEVELMARMAKT IDSFTQNQTRLWI I DG LDACEQDK 
VLQMLDTVRVLFS KGPF I AI FASDPHI 1 1 KAINQNLNSVPSGFK 
\LNGHDYMRNI VHLPVFLNSRGL/RQ/ LQENFS * LQQQMETFHA 
QILQG YRKKLTEEFHRTALGR *QNLVARQPS I DG* DAIGFELYV 
CIAIQFNTNKDDAT 


5711 


1526 


1130 


RRH PFQWTTVTQEAFSHHDVAFTSTPVLFYPDS AQP FI VKS ESS 
SQ I AKAVLSQQR PS LFHECAFHF FS * SLQRHTINLDQGI F+ LLM 
LSEERQHLFESS/ IWTTPHNLK* / FEIHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSLDISERLKFLLTLDCVDDTLIVLAEEHGCLDIIKELP 
ETVI D! » T tN KCLT FHPS KR PTPDELMKDK VFS E VS PL YTP FTKPA 
S LFSS SLRCADLTLPEDI SQLCKDINNDYLAERS I EE VYYLWCL 
AGGDLEKELVNKEI IRSKPPI CTLPNFLFEDGESFGQGRDRSS / 
TFR * YHWDIWMPAKK+ 1 ERCWGRS I LP ITLKMTSLI LPYSNSN 
NELS AAATLPL 1 1 REKDTE YQLNR 1 1 LFDRLLKAYP Y KKNQI W K 
EARVDI P PLM RGLT WAALLG VEGA I HAK X DAIDKDT P I PTDRQI 
EVDI PRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLVYWQGLD 
S LCAPFLYLNFNNEALVYACMSAFI P K YLYNFFLKDNSHVI QE Y 
LTVFS QM I AFH D P ELSNHLNE I GF I PDL YA I PW FLTM FTHVFPL 
HKI FHLW \ DTLLLGEFLFP I L YWE 


5713 


634 


284 


P VCAVP VDRW PVL PRE DQEGQQL* AKLPRDFRR * FQ I LGPMEGH 
TACRCSRRGAQVQHLPRED IRAAE *DPHLREVWPGLPTSS ATS P 
♦ RAVLTS P CSHLGS ADAAS SHWLCGVS FH 


5714 


212 


613 


WGLGLG PTMS S LGGGS QDAGGSS SSS TWGS GGSGS SG PKAGAAD 
KSAWAAAAPASVADDTPPPERRNKSGI ISEPLNKSLRRSRPLS 
HYSSFGSSGGSGGGSM^GESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTP PASKLQGGGGGLQTGWGLH P VPVTAAS PLPRWCLFGAVAK \ 
GLPGP * LCPSGAA/ GGLQRGPGLS PLGAAG KVSCLH PPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS 
PRPPGS / SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 
QIRRKETKPL* RKTPAG\NNYQSNSI PVSQSPQLTVDLLPSAGR 
TQAPSGRGDAGKPTPGHG \LPKAS VILTPNCPCSLAGGQ* PPGL 
x fi\i i'KVKKWKKFb/ LLQiPSQ*GSRQSTC*EV\GALGEPVRI PG 
L* PDLS CI LSNGSKHRREGLS FPRSLGPGRRGPAGLQSLGCS PT 
PKNTACHS SGHVALQAGHDS ARDVGSGHVALQAGHDS TQDVGRP 
VWRWI PLE * LGLSRETGQATRRGLVWISPGRAAAACVACAOALE 
EGPLRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GLT/GVPGTDPKRGGRKPGQSGQETQGPTVisrSGPESPLQPKP*E 
RQE / VGAGAS SGVGLS RGRAGGPSSAWEVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPG L.DSAAEPHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL* SGFFTI I VGG YSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTS PLPSSVDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 
™a LibHK riaijf iCajbo UUh. i^CjLrtjKA LiS Aisi h. v EEPARGPGEARGE 
RPGPACQLCGGPTGEGPCCGAGGPGGGPLLPPRLLYSCRLCTFV 
SHYSSHLKRHMQTHSGEKPFRCGRCPYASAQLVNLTRHTRTHTG 
^Jvtr * »\v^irtii-±rr/ii w i5£>ijorJijKKriyx I tLfvjtrk' I FPCPTCGF RCCTP 
RPARPPSPTEQEGAVPRRPEDALLIiPDLSLHVPPGGASFLPDCG 
Q\ CGVKGRAS AGLDQNHCQS /S I*FP WTCRG CGQELE EGEGS RLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGEKPYKCPL 
CP YACGNLANL KRHGR I HSGDKPFRCSIjCNY S CNQS MN1> I RHM 


5718 


120 


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S * * STADPLHL 


5719 


48 


428 


ELNNGPFQMPLCNGGNIiAVTGSWADRSPLHEAASQGRLLALRTI, 
I»S QG YNVNAVTLDH VT P LHEACLGDHVACAR7LLE AGANVNAIT 
I DG VTPLFNACSQGS PSCAELLLEYGAKAQP \ESCLPSP 


5720 


1 


1051 


LQAFRNASEVPMVLVGTQDAISAA\NPRVYRRTSRARKI,STDLK 
\RCT\YYE\TCGGTYGLQMMSVSFQDVAQKWAIi\RKKQQ\IjAI 
GPCK\SLPN\SPSH\SAVSAAS I PARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQREI,RIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP\DREKKAAGCKVDSIGSGRAIPIKQGILIiKRSGKSLNK 
EWKXKYVTLCDNGLLTYHPSIiHDYMQNIHGKEIDLLRTTVKVPG 
KRIiPRATPATAPGTSPRANGLSVERSNTQLGGGTGAPHSASSAS 
LHSERPLSSSAWAGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
QHPASG 


5721


97 


492 


RHSSPCCSLRRTERSSNAAVST/TTVQQFKRFIENYRRHIGCVA 
VF YAI AGGLFIiERAY YYAFAAHHTG I TDTTRVG 1 1 LSRGTAAS I 
SFMFSYILLTMCRNLITFLRETFLNRYVPFDAAVDFHRLIASTA 


5722 


88 


1043 


VALDVLAGS S PGGGMAGALIjGPR VHG I RAVLRVARGG VGAPG AP 

gslgvshaaapparpqgaaqsphrgrr:4ggggaglppprsprfp 

QES VPAS TSTARG PRRVSRR1>PPQH PGPRGRRRR PGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQS cg 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPPPHLGALTAGSGEERQSQPRAETIjRLGRGAPLP\PRAERGG 
RPKQAEQQQ\ PKRPTPPARG PQSSGDPAMLPQRAGIiRTGGIiAGT 
KSSTREIPEMI 


5723 


88 


1043 


VALDVLAGSS PGGGMAGALLGPRVHG I RAVLRVARGGVQAPGAP - 
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPG PEPDQS CG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRP PADAPAPAPAPA 
PPPPPHLGALTAGSGEERQSQPRAETI*RLGRGAPLP\PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPC2RAGLRTGGLAGT 
KSSTREIPEMI 


5724 


3 


1841 


FTNEAPPAPLPDASASPLSPHRRAKSLDRRSTEPSVTPDLIiNFK 
KGWLTKQYEDGQWKKHWFALADQSLRYYRDSVAEEAADLDGEID 
LSACYDVTEYPVQRNYGFQIHTKEGEFTIiSAMTSGIRRNWIQTI 
MKHVHPTTAPD VTSSLPEEKNKS SCS FETCPRPTEKQEAELGEP 
DPEQKRSRARE\RRREGRSKTFDWAEFRPIQQALAQERVGGVGP 
ADTH \ D P WR P EAEHG ELERERARRREE RR KRFGMLD ATDG PGTE 
DAALRMEVDRSPGI*PMSDI»KTHNVHVE I EQR WHQVETTPLREEK 
QVPIAPVHLSSEIX3GDRLSTHEl*TSLIiEKEI>EQSQKEASDt»LEQ 
NRLLQDQLRVALGREQSAREGYVLQATCERGFAAMEETHQKKIE 
DLQRQHQRELEKLRE EKDRLLAEETAAT I S AI EAMKNAHREEME 
RELEKSQRSQISSVNSDVEALRRQYLEEIiQSVQRELEVLSEQYS 
QKCLENAHIJVQAIjEAERQALRQCQRENQELNA^QELNNRLAAE 
ITRLRTLLTGDGGGEATGSPLAQGKDAYELBVPSGARPCLTQLC 
TQEPQGSAAWPLSYRWGGTDLRQQESQGPGRSKSPEGGEEQ 


5725 


3 


1049


VNGHSEE TSQS PNRTEPHDSDCS VDLG I SKST3DLS PQKSGPVG 
S WKSHS ITNMEIGGLKI YDILSDN\DLSSHLQPLK/ FTSAVDG 
KNI VRSKAATLLYDQPLQVFTGS SSSSDIiI SGTKAI FKFDSNHN 
P E /GAKYNKR PHKWAHNLHLKYMVLHS 1 ISNTVAV\RSQRHFVA 
^* ^^wArLyriSijAPo/ VDWKAU/ ±Nyi» 3TAKHSANMNFSNHN 

NVRANTAYHLHQRLGPARHGEMWAISPNDRLIPAVTRSTIQRQS 

SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 

SQRPLSARTYSIDGPNASRPQSARPSINE1PERTMSVSDFNYSR 
TSP 


5726 


2 


486


j SRSbSMWWNSGJjPASSHSSKLPVTVGFSGCVKRLRLHGRPLGAP 
j TRMAG VTP C I LG PLE AGLF FPG S GG VI TL / ES VG AG I PG PS RAG 
j QGS PGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GLI FHLGQARTPP YLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPILILKETRR1.PWATGYAEVINAGKSTHWEDQASCEVLTVKKK 
AGAVTSTPNRNSS KRRSSLPNGE 


5728 


2 


877 


GTRNGQFE PRRGRAWEGSAGGLRAPGAAAGG PG VQPRGSG / LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GG PAGAGGDAG / LPGRCPS APWRAGS RPAAS CPDWI PGPQGLWL 
HRN PT S /G P P SQ I GEGAEQGDEG VADA PQ I QC KN/ GAEDPPAED 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHItAEGGA 
KGS PRRLADPQDLPAGQMSLAPPFPP VAAVI RSNK 


5729 


1 


1525 


AGG AR E VLTLQIjGHFAGF VGAHKWNQQDAALGRATDS KE P PG EL 
CPDVLYRTGRTLHGQETYTPRLILMDLKGSLSSLKEEGGLYRDK 
QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KS I PNGKGSS PLPTATTPKPLI PTEAS I RVWSDFLRVHLHPRSI 
CM I Q K YNHDGE AGRLEAFGQGES VLKE P K YQEELEDRLH FYVEE 
CDYLQGFQILCDLHDGFSGVGAKAAELLQDEYSGRGIITWGLLP 
GPYHRGEAQRNIYRLLNTAFGLVHLTAHSSLVCPLSLGGSLGLR 
PE PPVS FPYLH YDATLPFHCSAILATALDTVTCS \ Y RLCSS PVS 
MVKL\ADMLSFCGKKWTAGAI I PFPLAPGQSLPDSLMQFGGAT 
PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 
LHACTTGEEILAQYLQQQQPGVMSSSHLLLTPCRVAPPYPHLFS 
SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQ APARKTC VECQKTVY PM E RLLANQQVFH ISC FRCS YCNNK 
LSLGTYASLHGRI YCKPHFNQLFKS KGNYDEG FGHR PHKDLWAT 
KI ETEGFWER PRNFENC/GR PLKS PGGEDCPSC* GG CPGSN Y * AQ 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGELIPKDSCYMRJCPPRRPKKRRQG/CALPQGCLTFKDVAI~ 

EFSLEEWKCLNPAQRALYRAVMLENYRNLESVGLTSKEJSWYMRK 

KPGRGRGKQRRQEWFFLRVY 


5732 
5733 


226 


772 


PPSRSCQSPKRKSRRRAHVTVTLVCGFTSFSFSLPLYLCGCLRF 
PERTCSQLQQADWAPDFGPSSFVPSWGATATGARKFLIAFNI\N 
LLG TKE QAHR I ALNLREQGR GKDQPGRLXKVQG I G W YLDEKNLA 
Q VS TNLLD FEVTALHT VYE ETCREAQELS LPWGSQLVGLVPLK 
ALLDAA 




1 


460 


PALQEVNANALAWGKQYENDARTLFEFTSGVNDTESPI I YRDES 
MRTACS PDGLCSDGNGLELKCPFTSRD FMKFRLGGFEAI KSAYM 
AQVQYSMW VTRKNAWYFAN YDPRMKREGLH YWI ERDEKYM\AS 
FDEI \ VP \ EFIGKMDE VLSRD PM 


5734 


3 


968 


R CNS PES LTSLLVLLTTANIf LFVLI PAYS KNRAYAI F F I VFT VI 
GSLFLMNLLTAI I YSQ FRG YLMKSLQTS LFRRRLGTRAAFE VLS 
SMVGEGGAFPQAVGVKPQNLLQVLQKVQLDSSHKQAMMEKVRSY 
GSVLLSAEEFQKLFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
HYYFDYLGNLIALANLVS ICVFLVLDADVLPAERDDFI LGILNC 
VFIVYYLLEMIJ.KVFALGLRGYLSYPSNVFDGLLTVVLLVLEIS 
TL\VCTDCHTQAGGRRWW/RLLSLWEMTRMLNML I VFRFLR IIP 
SMKPMA WAS TVLGL 


5735
5736 


2 
1 


540 
382 


F FTPCVARAFNF PDQATVKKAA YSLPR VGGGTS CGLPQARR I SL 
ATPRQL YK/ SSNMTQRWQRR E I SNFE YLMFLNTIAGRT YNDLNQ 
YP VFP W VLTNYES EELDLTLPGNFRDLS KP I GALNPKRAVF YAE 

RYETWEDDQSPPYHYNTHYSTATSTLS WLVRI VS I FIELACLWY 
LKILT 

GTRPSTKKSGYSPOQVAVIHCKGHQKENTAVAHSNQKADSAAQV 
TARLSVTPPNLLPTVSFPQPDLPDNPVYSTTTEKLASDLRANKN 
QES* * ILPDSGI PI P *T*TS YLQSTTHLRRAKLPQLLRR 


5737 


290 


1041 


KACLHLLSS FLTSNFLFNPLLPDS LYS VEARSQRANLGPCRRKR 
LQTLMRLAAG FQYSSHKDPSLSAKEKETDYHNEARGPWPGWVG * 
RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLEGGTP APLQLE I P PQ PRGHPAP I PTGQAGPRDSG PGAS P * V 
ETRPLTDGRR * PGVR PVG WTP AHPAGTL RPRGAVEPS VS ACGKW 
APS PTS QG CCEGRCDAVP KHRAWRT PLCSQ 


5738 


8 


460 


DTLSLNCTLPETLPMTPSF* LS FL* FPGIiARAKS I PTKTYSNEV 
VTLWYRPPDILLGSTDYSTQI DMW * GQVEVWQGPCGKGGGBVTT 
ATQ?AAFI>FTVPSI,PRGVGCIFYEMATGRPIjFPGSTVEEQLHFI 
FR I liS E EAW AL CAVE THR 


5739 


1 


1222 


S FQRRG I RWTJVHTLH PH PRAV W AG I GRGHGS * AULGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDLLAEV 
SAEVDGPVPG YLSS PQS I TDTCLY I FTSGTTGLPKAAR I SHLKI 
LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HKVRLAVGSGLR P DTWERFVRR FG PLQ VLETYGLTEGNVATIN Y 
TGQRGAVGRASWLYKHI FPFSLIRYDVTTGEP I RDPQGHCMATS 
PGEPGLLVAPVSQQSPFIjGYAGGPELAQGKLLKDVFRPGDVFFN 
TRDIiLVCDDQGFLRFHDRTGDP FRWKGENVATTE VAEV FEALD F 
LQEVNTVYGVTV 


5740 


2S5 


231 


PAYWLKVPTLCIjESKTDLREKASHVSAQIjQGEVRGIiAGA1»WM*A 
YVYERVYN * N I S RMVHALEQKRHPAGIjS s smalqlnpclgmlma 

lqselhklydeetqswvsgsacggyp 


5741 


1 


650 


PRKTMRRGVLMTLIjQQSAMTIiPLWIGKPGDRPPPLCGAIPASGD 

warpgdkvaarvkavdgdeqwiiaewsyshatnkyevddide 

EGKERHTIiSRRR V I PLPQ W KANPBTDPE AI»FQKEQLiVLAIt YPQT 
TCFYRALIHAPPQRPQDDYSVIjFEDTSYADGYSPPLNVAQRYVV' 
ACKEPKKK*CRLADSPSPNDTGQDSRGRAGIKHIPPLKKK 


5742 


2 


362 


TQSVKEILKRNPNVNLTDKDGNTALMIASKEGHTEIVQDLlrDAG 
TYVNI PDRSGDT VLIGAVRGGHVE I VRAIxLQKYAD I DIRGQDNK 
TAIi YWAVE KGN ATMVRD I LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAI EE I E IDLEETEREISPQENGLEEVKPIiGEMQTDI* 
KATGREI S PREKTPEVIDATEEIDKDIiEETGRREI S PEENGPEE 
VKPVDEMETDLKTTGREGSSREKTREVIDAABVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTRQMTTTPAAIiPTTVVTTPDLTTGTPLQMTTIA 
VFTTANTCLSLTPSTIiPEEATGLLTPE PS KEGPI LTAES ETVZiP 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGIPMSMKNEMPISQLLMIIAP 
SIX3FVLFALFVAFLI.RGKLMETYCSQKHTRIjDYI gdsknvlndv 
QHGREDEDGLFTL 


5745 


1400 


599 


GKSRFVNLMKHSKKTYDSFQDELEDYIKVQKARGIiEPKTCFRKM 
KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 
VENRI,PQWLPAHDSRLRIjDSLSYCQFTRDCFSEKPVPLNFNQQE 
Y ICGS HGVEHRVYKHFSSDNSTSTHQASHKQ IHQKRKRHPEEGR 
EKSEEERSKHKRKKSCEEIDLDKHKSIQRKKTEVEIE'TVHVSTE 
KLKNRKEKKSRD WS KKEERKRTKKKKEQGQERTEEEMLVJDQS I 


5746 


3 


821 


S FASGRLTPS S PAFDGELDLQRYSNGPAVS AWSLGMGAVSWS ES " 
RAGERR FPCPVCGKR FRFNS I LALHLRTHQPERPRS PAARI*I»I*E 
LEERALLREARLGRARSSGGMQATPATEGIiARPQAPSSSAFRCP 

AHGAPERP LAATS AAPP PQPQPQP PPQPE PRS VPQ PEPEPQ PER 
EATPTPAPAAPEEP PAPPEFRCQVCGQS FTQS WFLKGHMRKHKA 
SFDHACPV 


5747


2 


1328 


DRHVETLC IHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRODVDTEPQKRNTE 
ESSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD 
SDNGD INYDYVHELSLEMKRQKIQRELM KLEQENMEKREE 1 1 1 K " 

AVSS PLLDQQRNS KTNQS KKKG PRTPS PP P PI PEDI ALGKKYKE 
KYKVKDR I E EKTRDG KDRGRDFERQR EKRD KPRS TS PAGQHHS P 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 

ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDRE KDS R EFR F YPOTiOc; c cu nwo nrvwir r>o rtr~or>o 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQPSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGSGAGVISKTLTYPLDLFKKRLQVGGFEHARAA 
FGQVRRYKGLMDCAKQVLQKEGA1.GFFKGLSPSLLKAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLtAS 
ortoo i i^^/^&Kwyifc,uiKKLjI^ELESSQEKVATIjTSQLSANAN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAVIQGALNASETTPKELRIKRQNSSDSISSLNSITSHSSI 
GSSKD AD A ! 


5750 


22 


866 


I FI S I CLWNAHL CFLLLP KDCIDQVMKLQNLFVT>DS GR YLAI Q F 
IILEWAYVFIjYYYEYRKAKDQLDIAKDISQLQIDLTGALGKRTRF 
QENY VAQL I IjD VRREGDVLSN CE FT PAPTPQEHLTKNLELNDDT 
ILNDI KLADCEQFQMPDLCAEEI Al I LG I CTNFQKNNPVHTLTE 
VELIAFTSCIiLSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ 
TQALADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 


5751 


3 


751 


SCGS ALRAWRCG AAAIiAT FPAPA1/ PGLMYRAL YAFRS AE PNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYIjRRIjQGLEQ 
D VLQAI DRAI EAVHNTAMRDGGKYS LEQRGVLQKL I HHR KETIiS 
ftn'jr^rt^avMVPi i 1 oJJllnJ^AAAARQPNGVCRAGFERQHSijP 
SSEHLGADGGLFQI PLPS SQ I PPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 


471 


VVj "' 3 v v/ irWK^jfvrtsjVLaVjLstj RAALnbAb L» P C7LS GAAT 
VEREMELRHKNEMLR VETE ARARAKAER EfTAD I IREQIRLKASE 
HRQTVLES I RTAGTL FGEG FRA FVTDRD KVTAT VN I F I KQG WQ V 
AERQHVGAS WS PRS CPCRIiCTAIi 


5753 


34 


483 


DDSXA I PGGVQAP FGAVRM I YTPRTGHR I RKLDQ I QSGGN YVAG 
GQEAFKKLNYLDIGEIKKRPMEVVNTEVKPVIHSRINVSARFRK 
PLQEPCTI FLIANGDLINPASRLLI PRKTLNQWDHVLQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHVVEFAGEHAEAIASREQEVIjGGWKELIjSACEDARLHVSST 
ADALRFHSQVRDLLSWMDGIASQIGAADKPROPSSLLGLPASPW 
WPTPATPS PCTAPFSME 


5755 


3 


888 


LGDQFYKEAIEHCRSYNSRI^AERSVRLPFLDSQTGVAQNNCYI 
WMEKRHRGPGIiAPGOLYTYPAPC'WPWPDt t-tddttfjd vt orrcTv- 

PEVELPLKJOX5FTSESTTLEALLRGEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEEDIPKRKNRTRGRARGSAGGRRRHD 
AAS QEDHDKP YVCD I CG KR YKNRPGLS YHYAHTHLAS EEGDEAQ 
DQETRS P PNHRNENHR PQKG PDGT VIPNNYCDFCLGGSlTMNKKS 
GRPEELVS CADCGRSAHLGGEGRKE KEAAA 


5756 


3 


621 


SS KLQALFAH PL YNV PE E P P LLGAEDSLLASQEALR Y YRRKVAR " 
WNRPJ^KMYREQ^INLTSLDPPLQLRr J EASVn7QFHLGINRHGLYSR 
S S P WSKLLQDMRHF PTIS AD YS QDEKALLGACDCTQ I VKPSG V 
HLICLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRILDFRRVPPTVGR I VNVTKEI L 


5757 


3 


473 


YKDALLLPDNHRQWFENGTLKLTDVQKGMDEGEYLCSVLIQPQ 
LSXSQSVHVAVKVPPLIQPFEFPPASIGQLLYIPCWSSGDMPI 
RITWRKDGQVI ISGSGVTI ESKEFMSSLQI SSVSLKHNGNYTCI 
ASNAAATVSRERQLIVRVPPRFW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRLAVSRTDGTVEI YNLSANYFQEKFFPGHESRATSALC 
WAEGQRLFSAGLNGE IMEYDLQALNI KYAMDAFGGPI WSMAASP 
SGSQLLVGCEDGSVKLFQITPDKI PV 


5759 


2 


1240 


GNAAFAGQGVVYETFHMSDLPSYTTNGTVHVWNNQIGFTTDPR 
MARS SPY PTDVARWNAP I FHVNADDPEAVI YVCS VAAE WRNTF 
NKDVGADLVCYRRRGHNEMDEPMFTQPLMYKQIHRQVPVIjKKYA 
DKL I AEGTVTLQEFEEEI AKYDRI CEEAYGRSKDKKILHIKHWL 
DS PWPG FFNVDGEP KSMTCPATG I P EDMLTHIGS VASS VPIiEDF 
KI HTGLSR I LRGRADMTKNRTVD WALAE YMAFGSLLKEG I HVRL 
NGQDVERGTFSHRliHVLHDQEVDRRTCVPMiraLWPIDQAPYTVC^f 
S SLSEYGVLGFELGYAt^ASPNALVLWEAQFGDFHNTAQCI IDQF 
ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDFEVSQI* 


5760


1 


1221 


VRD1 TSDSLS LS WT VP EGQ FDKFIjVQFKNGDGQP KAVRVPGHED 
GVTISGLEPDHKYKMNLYGFHGGQRVGPVSAVGLTAPGKDEEMA 
PASTEPPTPEPPIKPR1.EELTVTDATPDSLSLSWTVPEGQFDHF 
LVQYKNGDGQPKATRVPGHEDRVTISGLEPDNKYKMNLYGFHGG 
QRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG 
SSPDSLSLSWTVPQGRFDSFTVQYKDRDGRPQVVRVGGEESEVT 
VGGLE PGRKYKMHLYGLHEGRRVGP VS T VG VTAPQEDVDETPS P 
TEPGTEAPE PP EE PLLGELT VTGS S PDSLS LS WTVPQGRFDS FT 
VQYKDRDGRPQAVRVGGQESKVTVRGLEPGRKYKMHIiYGXiHEGR 
RLGPVSAIGVT 


5761 


3 


1275 


SCDMAEAAALVW IR3PGFGCKAVRCASGRCTVRDFIHRHCQDQN 
VPVENFFVKCNGAIjINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ I EKTTNREAOTDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRUERLQRKLVE PKHCFTS PD YQQQCHEMAERLEDS V1>K 
GMQAASSKMVSAEISENRKRQWPTKSQTDRGASAGKRRCFWI^GM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGSQRARWNTDHGSPEQIjQIPVTDSGRHILEDSCAEIiGESK 
EHMESRMVTETEETQEKKAESKEP1EEEPTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKIjQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCELJVIALGLKCGGTLQ 


5762 


2 


344 


GSTGQTPLHSOGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MSSEEAANGKKSHWAELEISGKVRSLSASLWSIiTHLTALHLSDN 
SLSRIPSDIAKLHNLVYLDIiSSNKIR 


5763 


3 


429 


LDKDTGbl Ml» IARLD YEIiI QR FTI/T I IARDGGGEETTGRVR IN V 
LDVNDNVPTFQKDAYVGALREWE PSVTQLVR LRATDEDS PPNWQ 
ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYL 
TVMAMDAGN 


5764 


19 


441 


VCARACGEMRQLLRP I DRQR YDENEDLS DVEE I VS VRG FSLEEK 
LRSQLY0<3DFVHAM BGKDFKYE YVQREALRVPCjI FRE KDGLGI K 
MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTSMSMSQFVRYYE 
TPEAQRDKL 


5765 


3 


825 


QKILRLNWSHQPPTSSSNSKDCGGPASSGAGATAALADGLKFAS 
VQAS APQGNS HKETS KS KVKRS KTS KDANKS LPSAALYGI PE I S 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQMGSGSQAPSGGHIiYGFGAKSNGGGAS PFHCGGTGSGS VAAA 
GEVSKSAPDSGLMGNSMIiVKKEEEEEESHRRI KKIiKTEKVDPLF 
TVPAPPPHV 


5766 


1 CflR 
JLbUO 




SGLFS VDPASSQAMELSD VTL I EGVGNEVMWAG WVL I LALVI* 
AWLSTYVADSGSNQLLGAIVSAGBTSVLHLGHVDHLVAGQGNPE 
PTELPHPSEGWDEKAEEAGEGRGDSTGEAGAGGGVEPSLEHLIJ) 
IQGLPKRQAGAGSS S PEAPLRSEDSTCLPP S PG LI TVRLKFLND 
TEELAVARPEDT VGALKS KYFPG QESQMKLI YQGRLLQDPARTIj 
RStNITmCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVTVFVVLLGVWYFRINYRQFFTAPATVSLVGVTVFFS FIjV 
FGMYGR 


5767 


2 


892 


NFRATPRPPTRPELRTGTEVILWYIJDWRALMKRKRMKANIKLVG 
SGFPLPSSDLDDSLTEEIDEKIGFRNDANFDWQNVADFRDAGGS 
LTEVKVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENEI*PDFP 
HIDEFFTLNSTPSRSAYDEPHI*LVNIEKQKLELEKRRLDIEAER 
LQVEKERLQ1EKERLRHLDMEHERLQLEKERLQIEREKLRLQIV 
NSEKPSLENELGQGEKSMLQPQDIETEKLKIiERERLQLEKDRLQ 
FLKFSSEKLQ I EKER LQV EKDRLR I QKEGHLQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA 
AAMYAKDYPFYLTVKRANCSIiELPPASGPAKDAEEPSNKRVKPL 
S R VTS LANL I P P VKAT PLKRFSQTLQRS I S FRS ES RP D I LAPR P 
WSRNAAPSSTKRRDSKLWSETFDVC 


5769 


38 


667 


TKTKKGVKEKATDQSVKAFAEHCPELQYVGFMGCSVTSKGVIHL 
TKLRNLSS LDLRHI TELDNETAME I VKRCKNLI S LNLC LNW UN 
DRCVE VXAKEGQNLKEL YLVS CK I TDYAL I A I GR YSMT I E TVD V 
G WCKE I TDQGAT LI AQ S S KSLRYLGLMRCDKVN E VTVEQLVQQY 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVXTRKWSFLLEEHSKLIAXVRCLPQVQLDPLPTTLTLA 
FASQLKKTSLSLTPDVPEADI^EVDPKLVSNLMPFQRAGVNFAI 
AKGGRLLLADDMGLGKTIQAICIAAFYRKEWPLLVWPSSVRFT 
WBQAFLRWLPSLS PDCINVWTGKDRLT7A 


5771 


168 


741 


GLLPSACljRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGS DHS SLGLEQLQD YM VTLRSKLG PLE I QQ PAMLLRE 
YRLGLP IQD YCTGLLKLYGDRRKFLLLGMRP FI PDQDIG YFEGF 
LEG VG I REGG I LTDS FGR I KRSMS S TS ASAVRS YDGAAQR PEAQ 
AFHRLLADITHD I E 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
ILTFSNLVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773 


2


723 


PRVRSKHNFC FMEMNTRLQVEHPVTEM I1X3TDLVEWQLRI AAGE " 
KIPLSQEEITLQGHAFEAR I YAEDPSNNFMPVAGPLVHLSTPRA 
DPS TR I ETGVRQGDEVSVHYDPMI AKLWWAADRQAALTKLRYS 
LRQYNI VGLHTNTDFLLNLSGHPE FEAGNVHTDFI PQHHKQLLL 
SRKAAAKESLCQAALGLI LKEKAMTDTFTLQAHDQFSPFSS SSG 
RRLNISYTRNMTLKDGKNSK 


5774 


2 


592 


FVEEEWIR WRCGGSELN FRRAVFSADS KYI FCVSGDFVKVYST 
VTEE CVH I LHGHRNLVTG I Q LNPNNHLQL YSCS LDGTI KLWDYI 
DGILIKTFIVGCKLHALFTLAQAEDSVFVIVNKEKPDIFQLVSV 
KLP KSS SQB V EAKELS FVLDY INQS PKC IAFGNEGVYVAAVREF 
YLSVYFFKKETTSRVTLSSS 


5775 


3 


538 


SSGCCDPAAPSSLAEAATMPVSKCPKKSESLWKGWDRKAQRNGL
RSQVYAVKGDYYVGEWKDNVKHGKGTQVWKKKGAIYEGDWKFGK
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY
EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP
RP 


5776 


2 


484 


RliPQDCVCQNI^ESLGTLCPSKGLLFVPPDIDRRTVELRLGGNF 
I IHISRQDFANMTGLVDLTLS RNTI SHI QP FS FLDLE S LRS LHL 
DSNRLPSLGEDTLRGLVNLQHL I VNNNQLGGIADEAFEDFLLTL 
EDLDLS YNNLHGPAVGLRGDAW VQPS TS 


5777
5778


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ 
VXKLEQALKDGSAGLD PQLPGTC YSPHCP PDKAEAGSTLPENLG 
GGSGSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 
YRGSEGSPTKPFINPLPKPRRTFKHAGEGDKDGKPGIGFRKEKR 
WLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLLQSS S ESSRVDW YAQTKLGLTRTLS E ENVYEDILD P PMKENP 
1 fiwiiii^oKUX^KKCVXtNFPAS PTSSI PDTLTKQSLSKpAFFRQ 
KS2RRNV 




1 


1210 


QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAGVAEEPGS 
GGPCWLQLEEVPGPGPLGGGGPLRSPSSYSSDELS PGEPLTS PP 
WAP£X3APERPEHLI*NR VLERLAGGATRDS AAS D ILLDDI VLTHS 
LFLPTEKFLQELHQYFVRAGGMEGPEGLGRKQACLAMLLHFLDT 
YQGLLQEEEGAGH 1 1 KDL YLL IMKDESLYQGLR EDTLRLHQLVE 
TVELKIPEEWQPPSKQVKPLFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMPDHS YVTIRSRLSASVQD I LGSVTEKLQ YSEEPAGREDS 
LI LVAVSS SGE KVLLQ PTEDC VFTALG INSHLFACTRDS YEALV 
SEQ
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








PLPEEIQVSPGDTEIHRVBPEDVANHLTAFHWELFRCVH3LEFV 
DYVFHGE 


5779 


138 


1571 


EAVQVLIKHSADVNARDKNWQTPLHVAAANKAVKCAEVI IPLLS 
SVNVSDRGGRTALHHAA1.NGHVEMVNLLLAKGANINAFDKKDRR 
AXH WAA YMGHLD WALL INHGAE VTCKDKKG YTPLHAAASNGQ I 
NWKHLLNLG VE I DE I NV YGNTALH I ACYNGQDAWNELIDYGA 
NVNQPNNNG F TPLH FAAAS THGALCLELL VISING ADVN I QSKDGK 
S PLHMTAVHGRFTRS QTLIQNGGEIDCVDKDGNTPLHVAARYGH 
ELI>lNTLITSGAI>TAKCGII-lSMFPLHIiAALNAHSDCCRKLLSSG 
QKYSIVSLFSNEHVLSAGFEIDTPDKFGRTCLHAAAAGGNVECI 
KliL QS S GAD FHKKDKCGRTPIjH YAAANCHFHC I ETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCL3FLLQNDANPS I RDKEGYNS I HYAAAYGHRQCLE LLLE 
RTNSGFEESDSGATKSPLHLAVSEMP 






624 


OPFRVTTCLPFKGPDYRI*YKSEPELTTVAEVDESNGEEKSEPVS 
EIETSWKGSHFPVGWPPRAKS PTPESSTIASYVTLRKTKKMM 
DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHCQACLREK 
KKQLNVIGASDQSPLQS PSNLRDNP 


5781 


19 


941 


RGSLGGHPWRPPMRAASQGCIjPVSFVTGPHQERAYGGRGPGGAF 
PAPPVSGTCPPDL I YAPTPEKAEGGSQKNHQ PP PGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 

vqphftsqdaksaedeapsrhu3khqprsaqvgsru5alqgpkt 
qhsihtvtcksprqkedrspkppqapkhpeehgrqs\qappplp 
vapsrtcggc* twdpallvs p / pqgdstpelpap \qqptggpsr 
crqalppqg* rqqprqrpr / ptgas rsh pakakgcqgppki rny 

NIMD 


5782 


5176 


1237 


DRSMMS MAADS YTDS YTDT YTEAYMV PPLPPEE P PTMP PLP PE E 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEI SEPSAVPTDYSVSASDPSVLVS EAAVTVP EPP 
PEPESS I TLTPVESAVVAEEHEVVPBRP VTCMVS ETPAMS AEPT 
VLASEP PVMSETAETFDSMRASGHVASEVSTSI#I*VPAVTTPVLA 
ESILEPPAMAAPESSAMAVIjESSAVTVLESSTVTVIjESSTVTVL 
E?SWTVPEPPWAEPDYVTIPVPWSAI»EPSVPVLEPAVSV1^Q 
PSMIVSEPSVSVQBS TVTVSEPAVTVSEQTQVI PTEVAIESTPM 
ILESSIMSSHVMKGINLiSSGDQNLAPEIGMQEIAIjHSGEEPHAE 

ehlkgd f ye s ehg in i dlninnhli akemehntvcaagts p vge 
igeekilptsetkqrtvldtypgvseadagetlsstgpfalepd 
atg \ts kg 1 3 fttastlslvn kydvdls lttqdtehdmlistsp 
sggseadiegplpakdihldlpsnini*vssdtneplpvkrd\dq 
tlaali \slxessggekevppps * rehlpdsgfsaniedinead 
lvrp vss prt wnvlps pragl\eg p\ llasdfgp vqnlyss pw 
\ ssmp\erasgs\ ssge kgg\ ye i fvkvkdtheks kknknrdkg 
ekekkrdsslrsrskrskssehksrkltsesrsrarkrssksks 
h rs \qtrsrsrs /rdrrrrssrs rs ksrgrrs vs kekrkrs pkh 
rsksrerkrkrsssrdnrjcrvrarsrtpsrrsrshtpsrrrrsr 

S VGRRRS FS I SPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 
RTPSRRSRTPSRRRRSRSWRRRSFS IS PVRLRRSRTPIiRRRFS 
RSPIRRKRSRSSERGRSPKRLTDLDKAQLIiEIAKANAAAMCAKA 
GVPLPPNIiKPAPPPTI EEKVAKKSGGATI EEI.TEKCKQI AQSKE 
DDDVIVNKPHVSDEEEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PTPPKSQVTbTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
KDDDNVFSSNLPSEPVDISTAMSERAIjAQKRLSENAFDLEAMSM 
LNRAQE RI DAWAQLNS I PGQFTGSTGVQVI*TQEQIiANTGAQAWl 
KKDQFLRAAPVTGGMGAVLMRKMGWREGEGIiGKNKEGNKEPJLV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQPPEFLLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR 
Y 


5783 


1693 


698 


DSGLRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
QKLGFGTGVNVYZjMKRS prglshs PWAVKKINPICNDHYRSVYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCIiAMEYGGEKSL 
NOLI EE / PI * SQ/ PKI tiFQQP/Ii I LKVALNMARGLKYLHQEKKL 
LHGDIKSSNWI KGDFETI KICDVGVSLPLDENMTVTDPEACYI 
GTEPWKPKEAVEENGVITDKADI FAFGLTLWEMMTLS I PHINI>S 
NDDDDEDKTFDESDFDDEAY YAALGTRPP INMEELDES YQKVI E 


5784 


2669 


1388


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH
GILSNTHRQAARVNI>SFDFPFYGHFIjREITVATGGFIYTGEWH 
RMLTATQ Y I APLMANFDPS VSRNSTVRYFDNGTALVVQWDHVHL 
QDNYNLGSFTFQATLLMDGRI IFGYKBI PVLVTQISSTNHPVKV 
GLS DAFVWHRI QQ I PNVRRR7 I YEYHR VELQMS KI TNI S AVEM 
TPLPTCLQFNRCG PCVSSQI G FNCSWCS KLQRCS SG FDRHRQDW 
VDS GCFEES KE KMCE NTEPVET \ FLEPPQP * 2RQPPSSGS*LPP 
E / DAVTSQFPTSLPTE DDTKI ALHLKDNGAS TDDS AAE KKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR 
W PAH KFRRG SGH PA YAE VEP VGEKEGF I VSEQ C 


5785


2669 


1388 


PR VR PRVRTDHNYY I S RI YGPSDS ASRDLWVNI DQMEKDKVK I H 
GI LSNTHRQAARVNLS FDFPFYGH FLRE I TVATGGFI YTGEWH 
RMLTATQ Y IAP EjMANFD PS VSRNSTVRYFDNGTALVVQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FGYKE I PVLVTQISSTNHPVKV 
GLSDAFVWHRIQQI PNVRRRTI YEYHRVELQMSKITNI SAVEM 
TPLPTCLQFNRCGFCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E/ DAVTSQ FPTSL PTEDDTK IALHLKDNGASTDDSAAEKKGGTL 
KAGL I VGI LI LVLrVATAILVTVYMYHHPTSAASIFFI ERRPSR 
W PAMK FR RGSGHP AYAEVE PVGEKEG FI VSEQC 


5786 


2532 


1674 


S YKL PAAERRASS CSQ P PTPTRRRW PAPGRTS RGHRPQM * SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 
S *H * KRNLSQRSSSMSRRPLSCARPHR* *RQGLTVAARLPTWAK 
SPPLACSFCQAAQKSQSLSSGRSTR* PERMS FRP\SPPGNPAIP 
SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP * V7RPSGRLSTV* RA 
TGGS TATAP P KRFPRNWNPMMAE 


5787 


2 


1460 


MAS AASVTS LADE VNCP \ I CQGTLKEAGSLSNCG/HKNFCRACL 
T\RYCEIP\GPD\LEESP\TCP\LCKEPFRP\GSFRPNWQLANV 
VEN I ERLQLVS TLGLGEEDVCQEHGEKI YFFCEDDEMQLCWCR 
EAGEHATHTMRFLEDAA\AP YREQIHKCLKCLI KER EE I QE I QS 
RENK31MQVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLAQL3S 
QDGDI LRQRDE FDLLVAGE I CRFS ALI EELEEKNER PARBLLTD 
IRSTLIRCETRKCRKPVAVS PELGQRI RDFPQQALPLQRBMKMF 
LEKLCFELDYBPAHISLDPQTSHPKLLLSEDKQRAQFSYKWQNS 
PDNPQRFDRATCVLAHTG I TGGRHTWWS I DLAHGGSCTVGWS 
EDVQRKGELRLRPEEGVWAVRLAWGFVSALGSFP\TRLTLKEQP 
RQVRVSLDYEVGWVTFTNAVTREPIYTFTASFTRKVIPFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHS VSGRS S A YGDATAEGHPAGPGS VS SSTGAI S TTTGHQEGDG 
SEGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 
AI PYMQVI LMLTTDLDGEDE KD KGALDNLLSQL I AELGMDKKDV 
SKKNERSALNEVHLWMRLLSVFMSRTKSGSKSSICESSSLISS 
ATAAALLS SGAVD YCLHVL KSLLEY WKSQQNDEE P VATSQLLKP 
HTTSS P PDMSP FFLRQ YVKGHAADVFEAYTQLLTEM VLRLP YQ I 
KKITDTNSRIPPPVFDHSWFYFLSEYLMIQQTPFVRRQVRKLLL 
FICGSKEKYRQLRDLHTLDS \HVRGI KKLLEEQG I FLRASWTA 
S PQSALQYDTLI SLMEHLKACAEI AAQRTINWQKFC I KDDS VLY 
FLLQVS FLVDEGVS P VLLQLLS CAL CG S KVLRALAAS SGSSSAS 
SS PAP VAAS SGQATTQS KSSTKKS KKEEKEKEKDGETSGSQEDQ 
LCTAL VNQLNKFADKETLIQFLRC FLLESNS SSVRWQAHCLTLH 
I YRNSS KSQQELLLDLMWS I WPELPAYGRKAAQFVDLLGYFSLK 
TPQTEKKLKEYSQKAVEILRTQNHILTNHPNSNIYNTLSGLVEF 
DGYYLES DPCLVCNNPEVPFCYI KLSS I KVDTRYTTTQ QWKL I 
GSHTISKVTVKIGDLKRTKMVRTINLYYNNRTVQAIVELKNKPA 
RWHKAKKVQLTPGQTEVKIDLPLP I VASNLMIEFADFYENYQAS 
TETLQCPRCSASVPANPGVCGNCGENVYQCHKCRS INYDEKDPF 
LCNACGFCKYARFDFML YAKPCCAVDP I ENEEDRKKAVSN I NTL 
LDKADRVYHQLMGHR PQLENLLCKVNEAAPEKPQDDSGTAGGI S 
STS ASVNRYI LQLAQEYCGDCKNS FDELSKI IQKVFASRKELLE 
YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 
CAS AVTEHC ITLLRAIiATN PALRH I LVS QGL I REI* FDYNLRRGA 
AAMREE VRQLMCLLTRDNPEATQQMNDL I IGXVSTALKGHWAN P 
DIASSLQYEMLLLTDS I SKEDS CWELRLRCAliSLFLMAVNI KTP 
VWENITJuMCLRILQKLI KPPAPTS KKNKDVPVEALTTVKPYCN 
E1HAQAQLWLKRD P KAS YDAWKKCLP IRG I DGNGKAPS KS ELRH 
LYL.TEKYVWRWKQPLSRRGKRTSPLDLKLGHNNWLRQVLFTPAT 
QAARQAACT I VE ALAT I PSRKQQVLDLLTS YLDELSIAGECAAE 
YLALYQKL~TSAHWKVYLAARGVLPYVGNIiITKEIARLL»ALEEA 
TLSTDLQQGYALKSLTGLLSSFVEVESIKRHFKSRLiVGTVIjNGY 
LCLRKLVVQRTKliIDETQDMIiI*EMIjEDMTTGTESETKAFMAVCI 
ETAKRYNLDDYRTPVFIFERLCSIIYPEENEVTEFFVTLEKDPQ 
QEDFLQGRMPGNPYS SNEPGIGPLMRDI KNKI CQDCDLVALLED 
DSGMELLVNNKI I SLDLPVAEVYKKVWCTTNEGEPMRI VYRMRG 
LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 
RIAGIRDFKQGRHLLTVLLKLFSYCVKVKVNRQQLVKLEMNTLN 
VMLGTLNLALVAEQES KDS GGAAVAEQ VLS I MEI \ IQAEPNVEP 
LSEDKGNLX>LTGDKDQLVMIiIJDQINSTFVRSNPSVLCGI*liRIIP 
YLS FGEVEKMQI LVBRFKP YCNFDKYDEDHSGDDKVFL\DCFCK 
IAAG I K\NNSNGHQL\ KD1>\ I LQKGITQNALD\ YMKKHI P/SAA 
RI WDADI \WKS FCLRPALPFILRLLRGLAI QH PGTQVL IGTDSI 
PNLHKLEQVS \SDEG IGTLA\ENL\LESLREHPDVNKKIDA\AR 
RETRAEKJCKMAMAMRQK1A1^TLG\MTTNEKGQVVD/TRTALLEA 
DWEELI EEP\GL,TCCICREG YKFQPTKVLG I YTFTKRWLGGVW 
ENKPRETSRATSTVSHFNI VHYDC \HLA\AVS LARGREEWESAA 
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 
TYQLtflHDI KLLFLRFAMEQSFSADTGGGGRESNIHLI PYI IHT 
GLYVIjNTTRATSREEKNLOGFIiEQPKEKWVESAFEVDGPYYFTVr 
LALHI LPPEQWRATRVEI LRRLLVTSQARAVAPGGATRLTDKAV 
KDYSAYRSSLLFWALiVDLIYNMFKKVPTSNTEGGWSCSIiAEYIR 
HNDMPIYEAADKALKTFQEEFMPVETFSEFUDVAGLLSEITDPE 
SFLKDLUtfSVP 


5789


1 


2407 


LPLHAVEKTGRPGQPAIjKMPGKLRSDAGLESDTAMKKGETLRKQ 
TEEKEKKEKPKSDKTEEIAEEEETVFPXAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKVVSSKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPS EAASEESNSEIEQEI PVEQKEG\ AFSNFP ISEETI KIj 
LKGRGVTFIjFP I QAKTFHHVYSGKDI* I AQARTGTGKTFSFAI PI* 
IEKLHG\ELQDRKRGRAPQVLVIAPTRELANQVSKDFSDITKKI, 
S VACFYGGTP YGGQ FERMRNG I DILVGTPGRI KDHI QNGKLDLT 
iOiNHVVLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPWWFNVAKKYMKSTYEQVDLIGKKTQKTAITVEHIiAI KCH 
WTQRAAVI GDVI RVYSGHQGRTI I FCETKKEAQELSQNSAI KQD 
AQSLHGD I PQKORE I TIiKGFRNG S FGVIjVATNVAARGLD I PEVD 
LVIQS S PPKDVE S Y I HRSGRTGR AGRTG VCI CFYQHKEEYQLVQ 
VEQKAGIKFKRIGVPSATEI IKASSKDAI rlldsvpptaishfk 
QS AEKL IEEKGAVEALAAALAHI SGATS VDQRSL INSHVGFVTM 
ILQCSIEMPNISYAWKELKEQLGEEIDSKVKGMVFLKGKLGVCF 
DVPrASVTEIQBKWHDSRRWQLSVATEQPEliEGPREGYGGFRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRS FS KAFGQ 


5790 


3786 


1585 


ARRQRDPLQALRRRNQELKQQVDSLLSESQLKEALEPNKRQHIY 
QRC I QLKQAI DENKNALQKIiSKADES AP VAN YNQRKEEEHTULD 
KLTQQLQGIiAVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEEEKEENESHKWSTGEE Y I AVGDFTAQQ VGDLTFKKGE I 
LLVIEKKPDGWWIAKDAKGNEGLVPRTYLEPYSEEEEGQESSEE 
GSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAI SEAG I FCLVN 
HVS FCYL I VliMRNRMETVEDTNGS ETGFRAWNVQSRGR I FLVSK 
SEQ
ID 
NO: 



5791 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



5792 



5793 



5794 



2263 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1636 



653 



2263 



653 



5016 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 

PVLQQINTVDVLTTMGAIPAGFRPSTLSQLLEEGNQFRANYFLQ 
P ELMPS QLAKRDLMWD ATEGTI RSRPSR I SLILTLWS CKM I PL P 
GMS IQVLSRHVRLCIjFDGNKVIiSNIHTVRATWQPKKPKTWTPS P 
QVTR IL P CLLDGDCFI RSNS AS PDLG I L FELG I S Y I RNSTGERG 
ELS CGWVFL KLFDAS G VP I PAKT YELFLNGGTPYE KG I E VD P S I 
SRRAHGSVFYQIMTMRRQPQLLVKLRSLNRRSRNVLSLLPETLI 
GNMCSIHLLIFYRQILGDVLLKDRMSLQSTDLISHPMLATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PK2FLKVPRFLLVYH 

Xgcvlpll/htptrlppfrwaeeetetarwkvitdflkqnqenq 
galqalls ppg vhep fdls eqtydf lgemrknav 

LRviusb'AGTSR/ IGAGLIQPLHRAPARDHGLZjRGGAAPALSVSH " 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
NRSIQQGFCFNILCVGETGIGKSTLIDTLFNTNFEDYESSHFCP 
NVKLKAQTYELQESNVQLKLT I VNTVGFGDQ INKERS YQ?I VD Y 
IDAQFEAYLQEELKIKRSLFTYHDSRIHVCLYFISPTGHSLKTL 
DLLTMKNLDSKVYI I PVIAKADTVS KTELQKFKIKLMSELVSNG 
VQIYQFPTDDDTIAKWAAMNGQLPFAVVGSMDEVKVGNKMVKA 
RQYP WGVVQVENEKHCDF VKLREML I CTNMEDLREQTHTRH YEL 
YRRCKLEEMGFTDVG PENKP V£ VQET YEAKRHEFHGERQRKEE B 
MKQMFVQRVKEKEAILKEAERELQAKFEHLKRLHQEERMKLEEK 
RRLLEEEI I AFS KKKATS E I FHSQS FLATGSNLRKDKDRKNSQF 
FVKQKVPEHRRSSSQAWFIKKKLEVCFDFAVICFITS I FGEQPQ 
LLI FME KY FQVQGQY J S QS E 

AAAAPSPAWWCGVFWYWHTCWVMYGIVyTRPCSGDASCroPY 


IARRPKI^L\RH^FTTTRSHLGAENN I DLVLNVE DFD VBS KFER 
TVNVS VPKKTRNNGTL YAYI FLHHAG VLPWHDGKQVHLVSPLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDGSSLPADVHRYKKMIQLGKTVHYLPILFIDQLSN 
RVKDLMVINRS TTEL PLTVS YDKVS LGRLRFW I HMQEAVYS LQQ 
FGFSEKDADEVKGIFVDTNLYFI^ALTFFVAAFHLLFDFIAFKND 
ISFWKKKKSMIGMSTKAVLWRC FSTWI FLFLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTI FWRGLMPEFQFGTYS ESERKTE EY 
OTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

IITMPTSHRIACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 

AAAAPSPAWWCGVFWYVVHTCWVMYGIVYTRPCSGDASCIQPY 
LARRPKIiQL\RHSFTTTRSHLGAEIWIDLVLKVEDFDVESKFER 
TVNVS VP KKTRNNGTLYAY I FLHHAG VLPWHDGKQVHLVS PLTT 
YMVPKPEE INLLTGESDTQQ I EADKKPTSALDEP VSHWRPRLAL 
NVMADNFVFDGS S LPADVHRYMKMI QLGKTVH YLP ILFIDQLSN 
RVKDLMVINRSTTELPLTVS YDKVS LGRLRFWI HMQDAVYSLQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
I SFWKKKKSMIGMSTKAVLWRCFSTVVI FLFLLDEQTSLLVLVP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTQAMKYZiSYLLYPLCVGGAVYSLLNIKYKSWYSWLriVSFVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

1 1 TMPTSHRIACFRDDVVFLVYLYQRWLYPVDKRRVNEFGES YE 
EKATRAPHTD 

MGPRLSVWLLIaLPAALLLHEEHSRAAAKGGCMSGCGKCDCHGV " 

KGQKGERGLPGLQGyiGFPGMQGPEGPQGPPGQKGDTGEPGLPG 

TKGTRG P PGASG YPGNPGL PG I PGQDGP PGP P3 1 PGCNGTKGER 

GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 

FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKGQM 

GLS FQG PKGDKGDQG VSGPPGVPGQAQVQE KGDFATKGEKGQKG 

EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 

YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 

PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 

RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 

GDQGPPG I PGQPGFIGE IGEKGQ KGESCLICD IDGYRGPPGPQG 

PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK 
GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 
PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 
DKGQAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPGF 
PGPQGDRGFPGTPGR\PGL\PGEKGAVG\QPGIGFPGPPGPKGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGl.KGIi 
PGLPG I PGTPGEKGS IGVPGVPGEHGAIGPPGLQGIRGEPG P PG 
LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPG I PGFPG 
SKGEMGVMGTPGQPGS PGPWGAPGLPGEKGD \HGFPGS SGPRGD 
P3LKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGI PGPQGSPGLPGDKGAKGE KGQ 
AGPPGIGIPGLRGEKGDQGIAGFPGSPGEKGEKGSIGIPGMPGS 
PGLKGS PGSVG YPGSPGLPGEKGDKGLPGLDGI PGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGSKGEVGFPGIiAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 
GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 
GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGF 
QGPKGLPGLQGIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEP 
GLPG PEG P PGLKG LQGL PG PKGQQGVTGLVG I PG P PG I PG FDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGF1* 
VTRHSQT IDDPQCPSGTKI LYHGYS LLYVQGNERAHGQDLGTAG 
S CLRKFS TMPFLFCN I NNVCNFASRND YS YWLSTPEPMPMSMAP 
I TGENIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSS LW I 
G YS FVMHTS AGAEGSGQAIiAS PGSCLE E FRS AP F I ECHGRGTCN 
YYANAYS FWLATI ERSEMFKKPTPSTLKAGELRTHVSRCQVCMR 
RT 


5795


1X92 


61 


STRSPTVE YI SAHPHI LFMbliKG YEAPQ IAIjRCG I MLRECI RHE 
PLAKI II*FSNQFRDFFKYVELSTFDIASDAFATFKDLI*TRHKVI» 
VADFLEQNYDTI FEDYEKLLQSENYVTKRQSLKLLGELI LDRHN 
FAI MTKYI SKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVAS PH 
KTQPIVE IL.LKNQPKLIEFLSSFQKERTDDEQFADEKNYLI KQI 
RDLKKTAP * RALRDS KR 


5796 


2 


1078 


GRVGWELWCMY ISPPKDWWDAGDPSLP IRTPAMIGCS FWNRKF 
FGEIGBLDPGMDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIE 
RKKKPYNSNIGFTTKRNA1JIVAEVWMDDYKSHVYIAWWLPLEKP 
G I DIGDVSERRALRKS LKCKNFQ W YLDHVYPEMRR YNNT VAYGE 
LRNNKAKDVCbDQGPLENirrAI LYPCHG WG PQLAR YTKEG FLHI* 
GALGTTTIiLPDTRCLVDNSKSRLPQbLDCDKVXSSLYKRWNFIQ 
NGAIMNKGTGRCLEVENRGLAGIDLII^SCTGQRWTIKN3IK*R 
EGAGALEPGPQDMAAPPNIWTSCPGGETARGRQVTiDGPPRASPG 
QHRDPG 


5797 


2 


891 


PRVRQKTLVDVTLENSNI KDQI RNLQQTYEASMDKLREKQRQLE 
VAQVENQLLKMKVESSQEANAEVMREMTKKIjYSQYEEKLQEEQR 
KHSAEKEALLEETNS FLKAIEEANKKMQAAE I SLEEKDQRI GEL 
DRLIERMEKERHQLQLQLLEHETEMSGBLTDSDKERYQQLEEAS 
ASLRERIRHLNDMVHCQQ KKVKQMVEE I ES LKKKLQQKQJbL I LQ 
LLEKISFLEGENNELQSRLDYIjTETQAKTEVETREIGVGCDIiLP 
SOTGRTREIVMPSRKYTPYTRVIiELTMKKTIiT 


5798


644


115 


KILGSRWKSMSNQEKQPYYEEGARLSKIHLEKYPNYKYKPRPKR 
TCIVDGKKLRIG3YKQLMRSRRQEMRQFFTVGQQPQI P ITTGTG 
WYPGAI TMATTTPS PQMTS DCSSTSASPEPS LPVIQSTYGMKT 
DGGSLAGNEMINGEDEMEMYDDYEDDPKSDYSSENEAPEAVSAN 


5799 


2679 


1435 


LLST Y I KFINIiFPETKAT I QGVLRAGSQ LRNAD V£ LQQRAVE YL 
TLSSVAS TD VLATVLEEMPPFPERES S I LAKLKRKKGPGAGS AL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGIjRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQ PSLGPTPEEAFJLS PG P EDIGPP I P 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTSVOFQNFSPTVVHPGDLQTQLAVQTKRVAAQVDGGAQVQQVI, 
HIECLRDFIjTPP1»LSVRFRYGGAPQALTLKLPVT1I7KFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKIj1/3FGSA 
LLDNVDPNPENFVGAG 1 1 OT KALQVGCLLRLE PNAQAQM YRLTL 
RTSKEPVSRHLCELLAQQF 


5800 


2679 


1435 


LLSTYIKFINLFPETKATIQ^VLRAGSQLRNADVELQQRAVEYL 
TIiS S VASTD VLATVLE3MPP FPERESS I LAKLKRKKG PGAGS AL 
DDGRRJDPSSNDINGGMSPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 
EADELIiNKFVCKNNGVLFENQLLQIGVKSEFRQNIjGRMYIiFYGN 
KTS VQFQNFS PTWH PGDL^TQLAVOTKR VAAQVDGG AQVQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPC^EAQKIFKANHPMDAEVTKAKLLGFGSA 
LLDNVDPNPENFVGAG I I QT KALQVGCLLRLE PNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLI PDGEITS I KINRVDPSESLS IRLVGGSETPLVHI 1 1 
QHI YRDGVI ARDGRLLPGDI I LKVNGMD I SNVPHN YAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYR PRDDS FHVILNKSS PEE 
QLG T KLVRKVDEPGVFI FNVLDGGVAYRHGQLEENDRVLAINGH 
DLRYGS PESAAHLI QAS ERRVHLWSRQVRQRSPDI FQEAGWNS 
NGSWSPGPGERStTTPKPLHPTITCHEKVVNlQKDPGESLGMTVA 
GGAS HREWDLP I YVI S VE PGG V I SRDGR I KTGD I LLN VDG VELT 
B VS RS EAVALLKRTSS S I VL KALE VKE YE PQEDCSS P AALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
G YEE YNGNKPFFI KS I VEGT PAYNDGR I RCGD ILLAVNGRSTSG 
MIHACLARLLKELKGRI TLTIVSWPGTFL 


5802 


3 


290 


CFS L YQI MER I MDL PTLLRHAFREMFS VGGLF WMFR I RI ILCLM 
GAFFYLISPLDFVPEALFG1LGFLDDFFVIFLLLIYISIMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQ FGTTAE I YAYREEQDFGIE I VKVKAIGRQRFKVLELRTQSD 
G IQQAKVQI LP3CVLPSTMSAVQLESLNKCQI FPSKPVS REDQC 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQR 
LRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SH I GWKFTATKKDKS PQKFWGLTRSALLPTI PDTEDEIS PDKVI 
LCL


5804


2 


1707 


EMEKQRQEEQRKRTEEERKRRIEQDMLEKRKIQRELAKRAEQIE

D INNTGTES ASEEGDDSLLITWP VKS YKTSGKMKKN FEDLE KE 

REEKERIKYEEDKRIRYEEQRPSLKEAKCLSLVMDDEIESEAKK 

ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 

ARRQMVNEDEENQDTAKI FKGYRPGKLKLSFEEMERQRREDEKR 

KAEEEARJRRIEEEKKAFAEARR13MVVDDDSPEMYKTISQEFLTP 

GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQEFEQLRQEM 

GEEEEENETFGLSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 

EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 

EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLRMQPEQREI 

DAALQKKREEEEEEEGS I MNGSTAEDEEQTRSGAPWFKKPLKNT 

SVVDSEPVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYIERGE 

TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


YISDTLGQVYKSKIRWWIEENGGNGNISVDDLIALLDLAEHASS 
AFKESQQQS BDRE YE VKERLYPKSKRRYDTYNIAGYQGE I EVGL 
YTIQILQLIPFFDNKNELSKRYMVNFVSGSSDIPGDPNNEYKLA 
LKNYI PYLTKLKFSLKKSFDFFDEYFVLLKPRNNI KQNEEAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGLGSKFSBPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLLKPIHVFFGAAILSLSIASVISGINEKLFFSLKNTT 
RP YHS LPSEAVFANSTGMLWAFGLLVLYILLAS SWKRP 


5807 


2267 


1302 


RFS KKTFRRPMAVDIQPACLGLYCGKTLLFKNGSTE I YGECGVC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFFIEW 
YSGKKSSSALFQHITALFECSMAAI ITLLVSDPVGVLYIRS CRV 
LMLSDW YTMLYNPS PDYVTTVHCTHEAVYPLYTI VF I YYAFCLV 
LMMLbRPL^VKKIACGLGKSDRFKSIYAALYFPPILTVLQAVGG" 
GLLYYAFPY I ILVLSLVTLAVYMSASEI ENCYDLLVRKKRLI VL 
FSHWLLHAYG 1 1 S ISRVDKLEQDLPUUALVPTPALF YLFTAKFT 
EPSRILSEGANGH 


5808


2 


433 


SLPDSG WE YLjSNGGVADNHKDFG E bR YNE CLMNFS CNGKNG S S 
EGRITHGF0LKSAYEWNU«1PYTNYTFDFKGVIDYIFYSKTHMNV 
LGVLGPLDPQWLVENW ITGCPHPHI PSDHFSLLTQLEbHPPLLP 
LVNGVHLPNRR 


5809 


464 


2422 


I LVPGFQG I LHPGVY CALQSQHQAQELVADI DECE VSGLCRKGG 
RCVNTHGSFECYCHDGYbPRNGPEPFHPTTDATSCTE 1 DCGTPP 
EVPDGY I IGNYTSSbGSQVRYACREGF FS VPEDTVSSCTGLGTW 
ES PKLHCQE INCGNP PEMRHAILVGNHS SRLGGVARY VCQEGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSLFNDTCVRWQ 
INSRRINPKI S YVIS I KGQRLDPMESVREETVNLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIG FQTAEVDLLEDDGS FN I 
S I FNE^CLKLNRRSR KVGSEHhl YQFTVLGQRW YliAN FS HATSFN 
FTTREQVP VVCLDIiY PTTDYTVNVTLLRS P KRHS VQ I T I ATP PA 
VKQTISNISGFNETCliRWRSIKTADMEEMYLFHIWGQRWYQKEF 
AQEMTFNI SSS SRDPEVCbDLRPGTNYNVSLRAbSSELPWISL 
TTQ I TEP PbPE VEF FT VHRGPLPRLRLRKAKE KNGP I S S YQVLV 
LPLAbQS TFS CDSEGAS S FFSNAS DADGYVAAELLAKDVPDDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDSS I*MI>LQMAGVGLGSbAWI II/rFLS FSAV 


5810 


3 


1641 


KVFGTHKDHE VSTLDTAI S AVKVQIiAEFLENliQEKSLR IEAFVS 
B I ES FFNT I EENCS KNE KRUSEQNEEMMKKVLAQYDE KAQS FE E 
VKKKKME FbHEQMVHFLQS MDTAKDTLETI VRE AEELDEAVFLT 
S FEE INERLIiSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQP PRXEPQEPNS ATSTT I AVY WSMNKEDVI DS FQVYCME 
EPQDDQEVNELVEEYRLTVKES YCI FEDLEPDRCYQVWVMAVNF 
TGCS LPSERAI FRTAPS TP VTRAED CTVCWNTAT I RWR PTTPEA 
TETYTLEYCRQHS PEGEGLRSFSG I KGLQLKVNLQPNDN Y FFYV 
RAINAFGTSEQSEAALISTRGTRFLLLRETAHPALHISSSGTVI 
SFGERRRLTE I PS VLGEELPSCGQHYW ETTVTDCP AYRLG I CSS 
SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 
VGILLDYNNQRLIFINAESEQLLFIIRHRFNEGVHPAFALEKPG 
KCTLHLGIBPPDSVRHK 


5811 


1918 


851 


AAAliADPLP EDKWS AEKRR PLKSS LGY E ITFS LLNPDP KSHD VY 
W D I EGAVRR YVQ P FLNALG AAGNFS VDSQ I L YYAM LGVNP RFDS 
ASSSYYLCMHSLPHVINPVESRIXSSSAASLYPVLNFLX.YVPELA 
HSPL YI QDKDGAPVATNAF1I5 PRWGGI M VYNVDS KT YNASVLPV 
RVEVDMVRVMEVFLAQbRLLFGIAQPQbPPKCLbSGPTSEGLMT 
WELDRLLWARS VENLATATTTLTS LAQLbGKI SN I VI KDDVASE 
VYKAVAAVQKSAEEIiASGHIiASAFVAS QEAVTS S ELAFFDPSLI* 
HLLYFPDDQKFAI YI PLFLPMAVP I LLSLVKI FLETRKSWRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREEEVEPGTARPPPAASAMDASLEKIADPr 
XiAEMGiWLKEAVKMDEDSQRRTEEENGKJCLISGDrPGPLQGSGQ 
DMVS I LQbVQNbMHGDEDEE PQSPRIQNIGEQGHMALLGHSLGA 
YI STbDKEKLRKbTTR I LSDTTLWLCRI FRYENGCAYFHEEERE 
GLAKI CRLAIHSRYEDFVVDGFNVLYNKKPVIYLSAAARPGLGQ 
YLCNQLGbPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG 
RLPIXLVANAGTAAVGHTDKIGRLKEbCEQYGIWLHVEGVNIAT 
LALGYVSSSVLAAAKCDSMTMTP<3PWIjGLPAVPAVTLYKHDDPA 
LTLVAGbTSNKPTDKLRALPbWLSLQYLGLDGFVER I KHACQbS 
QRLQESbKKVNYIKII*VEDELSSPVWFRFFQELPGSDPVFKAV 
P VPKMTP SGVGRERHSCDALNRWLG EQbKQLVPASGLT VMDLEA 
EGTCLRFS PLMTAAVIjGTRGEDVDQIjVAC I ES KLPVLCCTLQLR 
EEFKQEVEATAGbLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
GENIHAGLLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVET I AATARE I EDNS RLLENMTEWRKGI QEAQVELQ KAS 
EERLLEEG VLRQ I PWGS VLNWFSPVQALQKGRTFNLTAGSLES 
TEPIYVYKAQGAGVTbPPTPSGSRTKQRLPGQKPFKRSLRGSDA 
LSKU-SSVSHIEDLEKVERLSSGPEQITLEASSTBGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


699 


HRDGVSGS LERPI/TDRS RTGA FAQQRGKMATAGGG SGAD PG SRG 
IiLRIiL>S F CVLLAGLCRGNS VE R KI Y I PLNKTAPCVRLLNATFQ I 
GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMViLESKHFT 
RDLMEKLKGRTSRIAGIiAVSLTKPS P ASGFS PS VQCPNDGFG VY 
S NS YG PE F AH CRE I OWNS LGNGLAYEDFS F P I FLLEDENETKV I 
KQCYQDHNLSQMGSAPTFPLCAMQLFSHMAWLSFSTAT\CMRRS 
S I QS TFS INP KI VCDP LSDYNVWSMLKP INTTGTLKPDDRVVVA 
ATRLDSRS FFWNV\APGAES AVAS FVTQLAAAEALQKAPDVTTI> 
PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEIi 
GQVALRTSIiELWMHTDPVSQKNESVRNQVEDLIiATLEKSGAGVP 
AVILRRPNQSQPbPPSSLQRFLRARNISGWLADHSGAFHNKYY 
QS I YDTAENINVS YPEV/JbEPLKE /ETWNFG* QDTAKALADVATV 
LGRALYEIAGGTNFSDTVQADPQTVTRLLYG\FLIKANNSWFQS 
ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VLQYALANL 
TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 

IAS KELELITLTVGFG IIiI FSL I VTYCINAKADVLFI APREPGA 
VSY 


5814 


8500 


432 


ALKCRPRRVLAIIiVGPVQPDRMAEEGAVAVCVRVRPIjNSREESLi 
GETAQV YWKTHNNVI YP VDGS KS FNFDRVLHGNETPKNVYEA\ 1 
AAP 1 1 DSAIQG YNGTIFA\ YGOT\ASGKTYTMMGS EDHLGVIPQ 
GQFHGHPSQKI * EVFLDREFLUIVS YMEI YNBTITDLLCGTQKM 
KPIillREDVNRWVYVADLTEEWYTSEMALKWITKGEKSRHYGE 
TKMNQRSSRSHT I FRMI LES REKGEP SNCEGS VKVSHLiNIiVDIjA 
GSERAAO/roAAGVRLKEGCWINRSLFILGQVIKKLSDGQVGGFI 
NYRDSKLTRILQNSLGGNPKTRIlCTITPVSFDETIiTALQFAST 
AKYMKNTPYVNEVSTDEAI*l.KRYRKEIMDLiKKQLEEVSLETRAQ 
AMEKDQIiAQLLE E KDLrLQKVQN EK I ENLTRML VTS S S LTLQQ3L 
KAKRKRRVTWCLGKINKMKNSNYADQFNI PTNITTKTHKLS INL 
LRBIDESVCSESDVFSNTLDTLSEIEWNPATKLLNQENIESELN 
SIJIADYDNI*VIJ)YEQLRTEKEEMELKIjKEK^LDEFEAIjERKTK 
KDQEMQLIHEISNLKNLVKHREVYNQDLENELSSKVELLREKEX) 
QIKKLQEY IDSQKLENIKMDl^S YSLES I EDPKOMKQTLFDAET V 
ALDAKRESAFLRSENLEiKEKMKELATTYKQMENDIQLYQSQIjE 
AKKKMQVDLEKELQSAFNEITKbTSLIDGKVPKDLLCNLELEGK 
ITDLQKEI^KEVEEIOEALREEVILLSBLKSLPSEVERLRKEIQD 
KSEELHI ITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 
SNYKSTDQE FQN FKTLHMD FEQKYKM VLEENERMNQE I VNLS KE 
AQKFDSSLGALKTELSYKTQELQBKTREVQERLNEMEQbKEQLE 
NRDS PLQTVERE KTI* I TEKLQQTLEE VKTLTQEKDDLKQLQES D 
QIERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 
KISEEVSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 
TADVKDNEI IEQQRKI FSLIQEKWELQQMLESVIAEKEQUCTDI. 
KENIEMTIENQEELRLbGDELKKQQEIVAQEKNHAIKKEGELSR 
TCDRLAEVEEKLKEKSQQLQEKQQQLLNVQEEMS EMQKKINE IE 
NLKNELKNKELTLEHMETERLELAQKLNENYEEVKS ITKERKVT, 
KE LQKS FETERDHL RG Y I RE I EATGLQTKEELKIAH I HLKEHQE 
TIDELRJlSVSEKTAQIINTODLEKSHTKLOEElPVLHEPnTTT t d 
NVKKVSETQETMNEIiELLrEQSTTKDSTTLAR IEMERLRL.NEKF 
QESQEIEI KSLTKERDNLICTIKEALE VKHDQLKEH IRETLAK IQE 
SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEIEMLG 
LSKRLQESHDEMKSVAKEKDDLQRLQEVLQS ESDQLKENI KE I V 
AKHLETEEELKVAHCCLKEQEETINELR VNLSEKETE IS TIQKQ 
LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 
KAKDSALQSIESKMIiEIiTNRLQESQEEIQIMIKEKEEMKRVQBA 
LQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 
IEHLKEQFETQKIiNLENIETENIRLTQILHENLEEMRSVTKERD 
DLRSVEETLKVERXQIiKEiJLRETITRDLEKQEELKIVHMHLKEH 
QETIDKLRG I VS EKTNBISNMQKDUEHSNDALKAQDLKIQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKI^NMQKDLENSNAKLQEK 
IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQIKDQSLTLSK 
LEIENLNLAQKLHENLEEMKSVMKERDNLRRVEETIiKLERDQLK 
ESLQETKARDIJE I QQELKTARMLS KEKKETVDKLREKISEKT 1 Q 
ISDIQKDLDKSKDELQKKIQELQKKELQI>LRVKEDVNMSHKKIN 
EMEQIaKKQFEPNYLCKCEMDNFQLTKKliHESLEEIRIVAKERDE 
LRRIKESBKMERDQF I ATLREMI ARDRQNHQVKPE KRLLSDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
RI MKKLKYVLS YVTKI KEEQHECINKFEMDFIDEVEKQKELL IK 
Tnnf noTk^ , r»Trao'D't?r dim vt MriNMnmTPPTTkTCTYF^P^iRPPS TTC 
TE FQQVLSNRKEMTQFLE EWLNTRFD I E KLKNGI QKENDR I CQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDIiKSLKEKNEKLF 
KNYQTLKTSIASGAQVNPTTQDNKNPHVTSRATQLTTEKIRELE 
NSLHEAKESAMHKESKI IKMQKELEVTNDI IAKxjQAKVHESNKC 
LE KTKET I QVLQDKVALGAKP YKEE I EDL KM KLGK I DLEKMKNA 
KEFEKEI SATKATVEYQKEVIRLLRENLRRSQQAQDTSVI seht 
DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 
EQh I KQKNELLSNNQHLSNEVKTWKERTLKREAHKQVTCENS P K 
S PKVTGTAS KKKQITPSQCKERNLQDPVP KE SPKSCFFDSRSKS 
LPSPHPVRYFDNS SLGLCP EVQNAGAESVDS QP \GPWARLFQGK 
DVP\ECK LQ 


5815 


23 


1460 


SEI^VMWTVQNRESLGIjLSFPVMITMVCCAHSTNEPSNMSYVKET 
VDRLLKGYDIRIiRPDFGGPPVDVGMRI DVAS IDMVSEVNMDYTL 
TMYFQQSWKDKRLSYSGIPLNLTLDNRVADQLWVPDTYFLNDKK 
SFVHGVTVKNRT4IRLHPDGTVLYGLRITTTAACMMDLRRYPI»DE 
CNCTLEIES YG YTTDDI EFYWNGGEGAVTGVNKI EIjPQFS I VDY 

S WV S FW INY D ASAAR VALG I TTVLTMTT I S TH LR ETL P KIP Y V K 
AIDIYLMGCFVFVFXiALLEYAFVNYIF FGKGPQKKGASKQDQSA 
NEKNKLEMNKVQVDAHGNI I*LSTLE I RNETSGSEVLTS VSDPKA 
TMYS YDSAS IQYRKPLS SRE\A*GRAP3RHGVPSKGR I RRRAS\ 
QLKVKI PDLTDVNSIDKWSRMFFPITFSLFNWYWLYYVH 


5816 


861 


191 


TVYHERQRLELCAVHALNNVLQQQLFSQEAADE I CKRLAPDSRIi 
NPHRSLLGTGNYDVNVIMAALQGl^LAAVWWDRRRPbSQLAJLPQ 
VI^LILNLPSPVSI^IaLSLPLRRRHLR^^ 

K\LRAPEGPGGLRTE\*GPFIAAAIAC^l J CEVLIiVVTKEVEEKG 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRS CRGCSGGREPSGGA1« P KRHCPC * P PSP PAAD 
VMSNTTV PMAPQANS DSM VG YVLGP FFL I TLVGVWAWM YVQK 
KKRVDRLRHHbLPMYSYDPAEELHEAEQELLSDMGDPKVVNQAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLIiPLLSP 
GSPCWVLGLHFSLHPPSAASASHALTITSIjPPGIjLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHLCFL 


5818 


3 


3318 


QAIiRDKLW IFI»VQS F YA VRHTES WKLMSTDDQQK I QAAAFDKGD 
DRRLGKKPI FSS SQQRKQVSDSGDIKI KS WRGNNKKECWSYLST 
NKKMKSIX5LGASGHSSSTNRNS INKTLKQDDVKEKDGTKIASKI 
TKELKTGGKNVfiGKPKTVTKSKTENGDKARliENMS PRQWERSA 
TAAAAATG Q KNLLNG KG VRNQEGQ I SGARP KVLTGNLNVQ AKAK 
PUCKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKNS VDSVKNSTVAI KSRPVSR VT 
KGTSNKKS I HEQDTNVNNSVLKKVSGKG CS EP VPQAI LKKRGTS 
NGCTAAQQRTKSTP SNLTKTQGS QGES PNS VKSS VSSRQSDENV 
AKLDHNTTTEKQAP KRKMVKQVHTALPKWAKIVAMPKNLNQSK 
KGETLNNKDS KQ KM P PGQVI S KTQPSSQRPIiKHETSTVQKSMFH 
DVRDNNNKDSVSEQKPHKPLINIiASEISDABALQSSCRP\DPQK 
PLNDQEKEKIiAIiECQNI SKLDKSIiKHELESKQI CbDKSETKFPN 
HKETDDCDAANI CCHS VGSDNVNS KF YS TTALKYMVSNPNENS Ii 
NSNPVCDLDSTSAGQIHLISDRENQVGRKDTNKQSSIKCVEDVS 
LCNPERTNGTLNSAQEDKKSKVPVEGLTI PS KI»SDES AMDEDKH 
ATADSDVSSKCFSGQLSEKNSPKNMETSESPESHETPETPFVGH 
WNLSTGVLHQRESPESDTGSATTSSDDIKPRSEDYDAGGSQDDD 
GSNDRGISKCGTMLCHDFLGRSSSDTSTPEELKIYDSNLRIEVK 
MK KQSSNDL FQ VNS TS DDE I PRKR P EI WS RSA I VH <5 R FT? FM T P w 
GSVQFAQEIDQVSSSADETEDERSEAENVAENFSISNPAPQQFQ 
G I INUAFEDATENECREFSAN KKF KRSVLLSVDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGNS VCKNESTVLDIiSS I DS SRXNKQSVSATEKKNTI DVL 
SSRSRQLLREDKKVNNGSNVENDIQQRSKFJLDSDVKSQERPCHL 
DLHQREPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 
ANIALSAGDIDDCDTLAQTRM YDHR P SKTLSPI YEMDVI EAFEQ 
KVES3THVTDMDF*DDQHFAKQDWTIjLKQLLSEQDSNLDVTNSV 
P EDLS LAQ YL I NQTLLLARDS S KPQG ITH I DTLNRWS ELTS PLD 
SSAS ITMAS FSS EDCSPQGEWTILELETQH 


5819 


1 


5557 


AAAGLLGAbHLVMTLWAAARAEKEAFVQSES I IEVLRFDDGGL 
LQTETTLGLSSYQQKSISLiYRGNCRPIRFEPPMLDFHEQPVGMP 
KME KVYXHN P SS E * T I TLVS I FATTS HFHAS FFQNRK I L PGGN T 
SFDVS/VFIiARWGNVENTLFINTSNHGVFTY\QVFGVGVPNPY 
RLR P FLGARVTVNSS FS P 1 1 N I HNP HS E PLQ WEM YS SGGDJUH L 
ELPTGQQGGTRKLWEI PP YETKG VMRASFSSREADNHTAFIRI K 
TNASDSTEFIILPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 
LN t»HLbNSGT KD VP I TS VRPTPQ \ NDAITVHFKP I TLKAS \ ES K 
YTKVAS I S PDAS KAK KPS Q FSGK I TVKAKE KS YS KI>E I P YQAEV 
LDG YLGFDHAATLFHI RDS PADPVER P I YIjTNTFSFAILIHDVL 
LPEEAKTMFKVHNFSKPVLIIiPNESGYIFTLLFMPSTSSMHIDN 
NI LLITNASKFHLPVRVYTGFLDYFVLPPKI EERF ID FGVLSAT 
EASNI LFAI INSNPI EIiAI KS WH I IGDG\LS I ELVAVDRGNRTT 
IISSLPECEKSSSSDQSSVTLASGYF\AVFRVKLTAKKL\EGIH 
DGAIQITTDYEILTIPVK\AVIAVGSLTCSPKHVVLPPSFPGKI 
VHQSLN I MNS FS Q KVKI QQ I RSIiS EDVRFYYKRLRGNKEDLE PG 
KKS KI ANI Y FD PGLQCGDHCYVGL P FI* S KSE P KVQ PG VAMQ EDM 
WDADWDIiHQSIiFKGWTGI KENSGHRiSAI FEVNTDL.QKNI I SKI 
TABLSWPS ILSS PRHLKFPLTNTNCSS \ EEE ITLENP/SQDVPV 
YVQFI PI^YSNPSVFVDKLVSRFNLSKVAKIDLRTLEFQVFRN 
SAHPLQSSTGFMEGXliSPHIiIIjNLI LKPGEKKSVKVK\ FTP VHN 
RTVSSLI IVRIWLTVMDAVMVQGCGTTENLRVAGKL.PGPGSSLR 
FKI TEALLKDCTDSLKIiR E PNFTLKRTFKVENTGQLQ I H I ETIE 
I SG YS CEG YG FKWNCQE FTL S ANASRD 1 1 ILFT PDFTAS R VI R 
ELKFITTSGSEFVFILNASLPYJIMIATCAEALPRPIWELAIjYI I 
ISGIMSALFLLVIGTA\YLEAQGIWBP\FRRRLS\FEASNPPFD 
VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGFCGAGGSSSRPSA 
GSHKQ * G P S GH PHS S HSNRNS ADVDDVRAYNSGRTSS MTS AQAA 
SSQPANKTRPIiVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 
PLEQHPQPPLPPPVPQPQEPQPERLSPAPIAHPSHPERASSARH 
SSEDSDITS LIEAMDKDFDHHDS PALBVFTEQPPS PLPKSKGKG 
KPI^RICVKPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 
TSNPDTE PLLKEDTE KOKG KOAMP EKFES EM<?f>VK01C <5 kkt.t m t 

KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK 
SRNAQKTKGTSKLVDNRPP ALAKFLPKSQELGNTS S S EGEKDS P 
PPEWDS VPVHKPGSSTDSIi YKLSLQTLNAD I FLKQRQTS PTPAS 
PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 
SIiPGKKGNPTFAAVTAG YDKS PGGNGFAKVSSNKTGFSSSLGI S 
HAP VDSDGSDSSGLWS PVSNPSS PDFTPLNS FSAFGNSFNLTGE 
VFSKLGLSRSCNQASQRSWNEFWSGPSYLWESPATDPSPSWPAS 
SGS PTHTATS VLGNTS GLWSTTP FS SS I WSSNLSSALP FTTPAN 
TLAS IGLMGTENS PAPHAPSTSS PADDLGQTYNPWRI WSPTIGR 
RSSDPWSNSHFPHEN 


5820 


310 


1270 


RVSI^GPVSJ^VLL<^UlSSTMG)a^DI^VAYMNPIAMARSRGPIQ 
SSGPTIQ\ VI * IDQGLPGKK* KSN* KRKRK/DSKALAEFEEKMN 
EJWKKELEIO^EKLLSGSESSSKKRQRKKKEKKKSW*\DSSSS\ 
SSSSDSSSSSSDSEDEDKKC^KRRKKKKNRSHXSSESSMSETBS 
DS KDSLKKKKKS KDGTEKE KDI KGLS KKRKMYSEDKPLS S ESL S 
ESEY IEEVRAKKKKS SEEREKATEKTKXKKKHKKHSKKKKKKAA 
SS S PDS P * H * EKSGFP Y KES AMSEE I S TVKTTTYIjLKCMNFLiVF 
Gil PGLFSSHSDATV 


5821 


173 


915 


KWRKQSWRWPKPGTNWM1.SCSVCWRRVTWTGSVWMRKJU5KHPQT 
PT/ 1 KDCSIAATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 
TYVI KLFDRS VDLAQ FS ENTPLYP I CRAWMRNS P S VRERECS PS 
SPLPPLPEDEEG\ SEVTNSKSR* CVQACPPTHTPGGQPKNACR\ 
SRI PSPLAAIiRMQGTP * RWSPFEPEPS PSTLI YRNMQRWKRI RQ 
RWKEASHRNQLRYSESMKILREMYERQ 


5822 


464 


4373 


QTLKEMPIVMARDLEETASSSEDEEVISQEDHPCIMWTGGCRRI 
PVI»VFHADA I LTKDNN I RV IG ER YHLS YKI VRTDS RLVRS I I/TA 
HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 
LTRKDRLYKNI IRMQHTHGFKAFH ILPQTFLLPAE YAEFCNS YS 
KDRGPWIVKPVASSRGRG\VYIjINNPNQISIjEENILVSRYINNP 
OL I DD FXFDVRLYVIiVTS YDPIiV I YLYEEGLARFATVR YDQGAK 
NIRNQFMHLTNYSVNKKSGDYVSCDDPSVEDYGNKWSMSAMLRY 
DKQEGRDTTALMAHVEDLI I KTI ISAELAIATACKTFVPHRSSC 
FELYGFDVLIDSTLKPWLLEVNL»SPSlACDAPI»DIiKIKASMISD 
MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
S DAEM KNLVGSAREKGPGKLGGS VLG LSMEE I KVI»RR VKEENDR 
RGGFI R I FPTSET WE I YGS YLEHKTS MNYMLATRLFQDRMTADG 
APELKI * SLNSKAXLHAALYERKLLSLEVRKRRRRSSRLRAMRP 
KYPV I TQPAEMNVKTETESEEEEE VALDNEDEEQE AS QEE SAGF 
IiRENQAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETQE 
LEPKFNIiMQI LQDNGNLS KMQAR I AFSAYLQHVQ I \RLMKDSGG 
QTFSASWAAKEDEQMELWRFLKRASNNLQHSLRMVLPSRRLAL 
LERTRILAHQIiGDFIIVYNKETEQMAEKKSKKKVEEEEEDGVNM 
ENFQEFI RQASEAELEEVLTF YTQKNKS ASVFLGTHS KIS KNNN 
NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIHSDKLSRFTTSA 
EKEAKLVYSNSSSGPTATLQKI PNTHLSS VTTSDIjS PGPCHHSS 
LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGfcP 
RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYIiNKHHS 
G I AKTQ KEGEDAS LYS KRYNQSMVTAELQRLAEKQAARQYS PSS 
HINIiLTQQVTNLNLATGI INRSSASAPPTLRPI I S PSGPTWSTQ 
SDPOJ^PENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGVVPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
LPNSNLWTMNNGAGCR ISS ATASGQKPTTI*PQKVVP P PSSCAS I* 
VPKPPPNHEQVLRRATSQKASKGSSAEGQLNGLQSSLNPAAFVP 
ITSSTDPAHTKIMNHKHTEKQPVHHSWVHD 


5823 


42 


2293 


IiliTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLEUWPVPEQPPLP 
TSESPFAWSPLAGEKF^EVYKEAHIXALHIESSSRNQAAQAAICP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLLGPPVGEPRMASSPALPS SGAQARLTRAPG P PHSAHALP 
RES CTAHAAS QAATQRKPGTKLLLPRAAS VRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVP\NKIiGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQP VAKAKS SEFAS I PAN * LPGLCPNI SKS \GRMGPAML«RPA 

GGG\O^IjNSSCAWSESSQimTRSIRRRDSCIJISKTKVMPTPTN 
QFKI PKFS IGDS\ P DS STPKLS RAQRPQS CTSVGRVTVHSTPVR 
RSSGPAPQSIiLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\ CVPARRRS SE PRKNS AMRTE PTRESNRKTDSR\ LVDVSPDR 
GS PPSRVPQALNFS PEES DSTFS KSTATEVAREEAKPGGDAAPS 
E ALLVD I KLE PLAVT PDAASQPL I DLPL I DFCDTPEAHVAVGSB 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPI.IQI5PEADK 
ENVDS PLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDBPS ACRAGDVNMDDPKKEDI LLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSES PFAWS PLAGEKFVE VYKEAHLLALH I ES S SRNQAAQ AAKP 


5825 






EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAAS VRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
kax f vf \WMjtrljKJCTIjIjKAPGS YSN\IiQRKSSSGA\VWSGASSA 
CTPQPVAKAKSS EFAS I PAN * h PGLCPN I S KS \GRMG P AMLRPA 
L\ PAGFVG\ ASSWQAKRVDVSEIoAAEQLTAPP \SAS PTQPQTP E 
GGG \ QWLNSS CAWS ES SQLNKTRS IRRRDS CLNS KTKVMPT PTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDI KLEPLAVTPJDAASQPLI DLPL IDFCDTPEAHVAVGS E 
SRPLIDLMTNTPDMNKNVAKPS PWGQLI DLS S PLIQLS PEADK 
ENVDSPLLKF 


5826 


2 


4210 


FLQIESASPAPFSSGFLAAHPHSPGGSIATKGRSRLSAPGMLHL 
SAAPPAPPPEVTATARPCLCSVGRRGCGGKMAAAGALERSFVEL 
SGAER ER P RH FREFT VCS IGTANAVAGAVKYS ESAGGF Y YVES G 
KLFSVTRNRFIHWKTSGDTLELMEESLDINI>LNNAIRLKFQNCS 
VLPGGVYVSETQNRVIILMLTNQTVHRLLLPHPSRMYRSELWD 
SQMQS IFTDIGXVDFTDPCNYQIiI PAVPG I SPNSTASTAWLSSD 
GEAI.FALPCASGGIFVLKLPPYDIPGMVSWELKQSSVMQRLLT 
GWMPTAIRGDQSPSDRPLS LAVHCVEHDAF I FALCQDHKLRMWS 
YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 
GIF\MHAPKRGQFClFQl»VSTESNRYSIiDHISSIjFTSQETLtIDF 
ALTS TD I WALWHDAEN QTWKY INFEHNVAGQWNPVFMQ PLPEE 
E I VI RDDQD PREM YLQSIjFTPGQFTNEALCKAliQI FCRGTERNL 
DLSV7SELKKEVTliAVENELQGSVTEYEFSQEEFRNLQQEFWCKP 
YACCLQ YQEAI»SH PliALHLN PHTNMVCLiLKKG YLS FI*I PSSLVD 
HLYLLP YENLLTEDETTI SDDVDI ARDVICLI KCLRL I EES VTV 
DMSVIMEMSCYWLQSPEKAAEQILE1>MITIDVENVMEDICSKLQ 
ElRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQtiY 
GSNTAG YI VCRGVHKI ASTR FLI CRDLLI LQQLLMRLGDAV I WG 
TGQLFQACXJDLI^TAPLLLSYYLIKWGSECLATDVPLDTLESN 
I£HLS VLELTDSGALMANR FVSS PQT IVEI»F FQ EVARKH IIS HL 
FSQPKAPLSQTGIiKWPEM I TAITS YLLQIiLWPSNPGCLFIjECLM 
GNCQYVQLQDYIQIiI^PWCQVNVGSCRFMLGRCYIjVTGEGQKAIi 
ECFCQAASEVGKEEFIiDRLIRSEDGEIVSTPRLjQYYDKVLRLiIJD 
VIGLPELVIQIiATS AI TEASDDW\ KS QATX \ RTCI FKHHI»\ DLG 
\HN£ QAYGSL * PQ I PDS SRQLDCLRQLVWLCERS QLQDLVE FS 
YVNI^EVVGI IESRARAVDLMTHNYYELLYAFHI YRHN YRKAG 
TVMFE YGMRLGRE VRTLRGLE KQGNC YIiAALKCIjRLI R P E YAWI 
VQPVSGAV YDR PGAS PKRNHDGECTAAP TNRQ I E I LELEDLEKE 
v-^ixn-n. j. x ijflyttu^bA v AVAC» £>SSAEEMVTLIjVQAGLFDTAIS 
IjCQTFK1iPLTP1^EGIAFKCIKILOFGGEAAQAEAWAWLAANQI>S 
S VI TTKESSATDEAWRLLS T YLERYKVQNNL YHHCVI NKLLSHG 
VPLPNWL INS YKKVOAAELLRLYLNYDLLDLTP YQVIRICGC 


5827 


3 


871 


KSQLLRDHSAPPPKPCTSVGAMGC+PRQ/SPKEQQRQLKKQKNR 
AAAQRS RQKHTDKADAEiHQQHES LE KDNLALRKE I QS LQAELAW 
WSRTLHVHERLCPMDCASCSAPGIiLGCWDQAEGLLGPGPQGOHG 
CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 
AVVAEPPVQLSPSPLLFASHTGSSrjCXSSSSKLSALQPSLTAQTA 
PPQPLELEHPTRGKLGSSPDNPSSALGIARLQSREHKPALSAAT 
WQGLWDPSPHPLLAFPLLSSAQVHF 




194 


2287 


GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVNKAAKVP* *HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPIiEVAIiETLSSAEVCAGI YDILLALI FLHDRGHLTHNN VCI> 
S S VF VS EDGHW KLGGMETVCKVSQATPE FLRS IQS I RDPAS I P P 
E EMS PE FTTLPECHGHARDAFS FGTLVES LLTI LNEQVS ADVLS 
S FQQTLHSTLLNP I PKWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKSEEE KTE FFKFLLDRVS CLS EELIAS RLVPLLLNQLVFAE P 
VAV\ KSFLP Y LLG PKKDHAQGET PCLLS PAL FQ5RV ± PVLLQL F 
E VH EEHVRM VLLSH I KAY VGALS LREQLKKV\ I L\ PQVLLG \ LR 
D\TSDSIVAITLHSLAVLVSLLGPEVWGGERTKIFKRTAP\SF 
TK\NTDLS LEGDPFSQP I KFP INGLSDVKNTSEDS ENFPSSS KK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCTTLDV 
EESSWDDCEPSSLDTKVNPGGG ITATKPVTSGEQKP I PALLS LT 
EESMPWKS SLPQKI SLVQRGDDADQIE PPKVSSQERPLKVPSEL 
GLGEEFTIQVKKKPVKDPEMDVJFADMI PB I KPSAAFLILPELRT 
EMVPKKDDVSPVMQFSS KFAAAE I TEGEAEGWEEEGELNWEDNN 
W 


5828 


2 


257 


AREGGSLGAVAACGELSYSCDFCPARPHTSWjTRFVKMEFQAW 
MAVGGGSRMTDLTSS I PKPLLP VGNKF L I W Y PLNLLERVGFEE V 
IWTTRDVQKALCAEFKMKMKPDIVCIPDDADMGTADSLRYIYP 
KLKTDVLVLS CDLI TDVALHE WDL FRAYDAS LAMLMRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GSILQKHPRIRFHTGLVDAHLYCLKKYIVDFLMENG\SITSIRS 
EL\ I PYLV /RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRIS Y 
S FY* KE ANYTGTG AP Y \ D\ ACW I 


5829 


260 


1259 


PDGRH VSCSEDKT I KI WDTTNKQCVNNFSDSVGFANFVDFNPS 
GTCI AS AGS DQTVKVWD VRVNKLLQHYQVHSGGVNC I S FH PSGN 
YLI TASSDGTLKI LDLLKGRLI YTLQGHTGPVFTVSFSKGGELF 
ASGGADTQVLLWRTNFD ELHCKGLTKRNLKRLHFDS P PHLLDI Y 
PRTPHPHEEKVETVEDFFLHLLRL I QSLR* S I CRSLLPLLWISF 
LLI LPQQQKPWGLCQTRVKRPVDIS *TLP * CHQNVCQQPRKRK 
QKT * VTSPVKVK / VS I PLAVTDALEH I MEQLNVLTQTVS I L EQR 
LTLTED FCLKD CLENQQ KLFS AVQQKS 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
NIEAAVQDRLNEQEGVPSVFTJPPPSRPLQVN7ADHRIYSYWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDIVS FMHS FEE KYGRAH PVFY QGTY SQ ALNDAKRELRFL 
LVYLHGDDHQDSDEFCRNTLCAPEVISLINTRMLFWACSTNKPE 
GYRVSQALRENTYPFLAMIMLKDRRE* PV\ VGRLEGLI \QPDDL 
INQLTF I MD ANQT YL VS E R LERE ERNQTQVLRQQQDEAYLAS LR 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKI I FKLPNDSRVERRFHFSQSLTVIHDF 
LFS LKES P\EKFQ I EA\NFPRR \ VLPCI PS EE\ WPNPPTLQE\ A 
GLSHTEVLFVQDLTDE 


5831 


71 


2897 


FCS KDKCCL YLPDS INRS KS CTAKPGAHSQDRHAVMDS ERQVKD 
TDD I ES P KRS I RDSGY I DCWDSERSDS LSP PRHGRDDS FDS LDS 
FGSRSRQTPS PD WLRGS SDGRGSDSESDLPHRKL PDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDAESTSMFDMRC3E 
E AAVQPHS RARQEQLQL INNQLREEDDKWQDDLARWKSRKRS VS 
QDLI KKEEERKKMEKLLAGEDGTSERRKSI KTYRE IVQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSSFLNDPKPMKYLRQQSLPPPKFTATVETTIARAS 
VLDTSMS AGSGS PS KTVTPKAVPMLTPKPYSQPKNS QDVLKTFK 
VDGKVSVNGETVHREEEKERECPTVAPAHSLTKSQMFEGVARVII 
GSPLELKQDNGS I E INI KKPNSVPQELAATTEKTEPNSQEDKND 
GGKSRKGNIELASSEPQHFTTTVTRCSPTVAFVEFPSSPQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YEEEP*II\EDPWPFTVSSSSADQLSTSSSMTEGSGTMNKIDL 
GNCQDEKQDRRWKKSFQGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNPVSKGWEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSS EDVKPKTLPLDKS INHQ I ES PSERRKSI SGKKLCSS CGL 
PLGKGAAMI IETLJ5LYFH IQ CFRCG\ ICKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRSRSAGQPTTL 


5832 


2454 


829 


PGRRFRHGS CAFQKQC I MLH I CQ YFLQGECKFGTS CKRS HDFSN 
SENLEKLEKLGMSSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF 
VPQGTSERKDS 3 GS VS PNTLSQE EGDQ I CLYH I RKS CS FQ D KCH 
R VH FH LP Y KWQ FUDRG KWEDLDNMELI EEAYCNPKI E R I LCSE S 
ASTTOSHCI^F'NAMTYGATQARRLSTASSVTKPPHFILTTDWIW 
YWSDEFGS VJQE YGRQGTVHPVTTVSSSDVEKAYLAY/W YTG V* R 
PGSHLEVPGRKAQLRVR FQS LRS EKPGLWHN* KGLPQTQ I R \ AP 
QDVTTMQTCNTXPPGPKS I PD YWDS SALPDPG FQ KI TLS SS SEE 
YQKVWNLFNRT LP FY FVQKI ER VQNLALWEVYQWQKGQMQKQNG 
GKAVPERQIiFHGTSAI FVDAI CQQNFDWRVCGVHGTS YG KGS YF 
ARDAAYSHH YS KSDTQTHTM FLARVLVGE F VRGNAS FVR P PAKE 

GWSKAFYDSCVNSVSDPSIFVIFEKHQVYPEYVIQYTTSSKPSV 
TPS I LLALGS LFSSRQ 


5833 


170 


3289 


S ILCLLS PC WQFGKP WS ILSSRSRHSPCTKKGWEGMRKHLHT 
RQGHK* VHVE I S KALW VYRDD Y F I RHS I S VS AVI VRAW I THK YR 
GRDWNVKWEENLLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
GYI WNLRANRI PQCPLEND WALLGFPYASSGENTGI VKKFPRF 
RNRELEATRRQRMDYPVFTVSljWLYLLHYaCANLCGILYFVDSN 
EM YGTPSVFI/TEEGYLHIQMHLVKGEDIAVKTKFI I PLKEWFRL 
DI SFNGGQI WTTS IGQDLKS YHNQTISFREDFH YNDTAGYFX I 
GGSRYVAG I EGFFGPLK YYRIiRSLHPAQI FNPLLEKQLAEQI KL 
YYERCABVQE I VS VYASAAKHGGERQEACHLHNSYIiDLQRR YGR 
PSMCRAFPWE KELKDKH PS LFQALLEMDLLTVPRNQNESVS EIG 
GKI FEKAVKRLS S IDGLHQISS I VPFX»TDSSCCGYHKAS YYLAV 
FYETGLNVPRDQLQGMLYSLVGGQGSERLSSMNLGYKHYQGIDN 
YPLDWELSYAYYSNIATKTPLDQHTLQGDQAYVETIRT.KDDEIL 
KVQTKEDGDVFM WLKHEATRGNAAAQQRLAQML FWGQQGVAKNP 
EAAIEWYAKGALETEDPALIYDYAlVI»FKGQGVKKNRRIiALELM 
KKAASKGLHQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE\MGN 
PDAS YNLGVLHLDG I FPG VPGRNQTLAGE YFHKAAQGGHMEGTIi 
WCSLYYITGNLErFPRDPEKAWWAKHVAEKNGYLGHVIRKGI^f 
AYI^GSWHEAIiliYYVIiAAETGIEVSQTNLAHICEERPDIiARRYL 
GVNCVWRYYNFS V FQ I DAPS FAYIiKMGDLYYYGHQNQSQDLEIiS 
VQMYAQAALDGDSQGFFNLALLI EEGTI IPHHILDFLE I DSTI>H 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SAL I YFLGTFIiLS I L XAWTVQ YFQS VS ASDPPPRPSQAS PDTAT 
STAS PAVTPAADAS DQDQ PTVTNN PEPRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSIJGQGNSQGPPRTPKPPRT/'QECG 
SAAPGPIPGQSSS *VPI*RLEQIQQKADCPLSLELALKPRMAAQV 
TLEDALSNVDLLEELPIiPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTG I ARYI EQATVHSSMNEMLEEGQEYAVMLYTWRSCSRAI 
PQVKCNEQPNR VE I YEKTVEVLEPEVTKLMNFMYFQRNAI ERFC 
GFATRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 
KNDHSAYKRAAQFI.RKMADPQS XQES QNLSMFLANHNKI TQS LQ 
CX3LBVISGYEELLADIVNLCVDYYENRMYLTFSEKHMLLKVMGF 
GLYLMDGS VSNIYKLDAKKR INLS KI DKYFKQLQWPLFGDMQ I 
ELARY I KTSAH YEENKS RWTCTSSGS S PQYNI CEQM I QIREDHM 
RFI SELAR YSNSE WTGSGRQEAQKTDAE YR KL FD1*AIjQGL0L L 
SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVEVIAMIKGLQVLMGRMESVFNHAIRHTVYAALQDFSQ 
VTLMEPLRQAI KKKKNVI QS VLQA I RKTVCDWETGHEPFNDPAlj 
RGEKDPKSG*DIKVPRRAVGPSSTQLYI4VRTMLESI,IADKSGSK 
KTLRSSLEGPTILD I EKFHRES FFYTIILINFSETLQQCCDLSQL 
WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVI, 
YSUJLYITOSAHYALTRFNKQFLYDEIEAEVNLCFDQFVYKIjADQ 
IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYETLLKQR 
HVQLLGRSrDLNRLITQRVSAAMYKSLELAIGRFESEDLTS I VE 
LDGU,E INRMTHKLLSRYLTLDGFDAMFREANHNVSAPYGR ITL 
HWWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 
QYLHGSKALNIAYSS I YGS YRNFVGPPHFQ VI CRLLG YQG IAW 
MEELLKWKSLI^TII^YVKTLMEVMPKICRLPRHEYGSPGII, 
EFFHHQIiKDIVEYAELKTVCFQNIjREVGNAILFCLLIEQSIiSLE 
EVCDLIiHAAPFQNILPRVHVKEGERI^AKMKRLESKYAPLHLVP 
LI ERLGTPQQ I AI AREGDLLTKERLCCGLiSMFEVI LTR IRS FLD 
DP I WRG ?LP SNGVMHVDE CVE FHRLWSAMQ FVYC I PVSTHEFTV 
EQCFGDGLHW AG CM 1 1 VLLGQQR RFAVLDFC YHLLKVQKKDGKD 
EI I KNVPLKKMVERIRKFQI LNDE I IT ILDKYkKSGDGEGTPVE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGN I RMAQGSHQ I DFQVLHDLRGK F PEVPEVWSRCMLQKNNNL 
DACCAVLSQESTRYLYGEGDliNFSDDSGISGLRNHMTSLKLDLQ 
SQN I YHHGREG SRMNGSRTLTHS I SDGQLQGGQSNS E LFQQE PQ 
TAPAQVPQG FN V FGMSS S SGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNP I MVTLAPNI GTGRNTPTSLHIHGVPPPVLNSPQGKS I YI 
RP Y I TT PGGTTRQTQQHSGW VSQFNPMN PQQ VYQPSQPG P WTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQSSAHSQYNIQNISTGPRKNQIEIKLEPPQRNNSSKLRSSGPR 
TSSTS S S VNSQTIiNRNQPT V Y I AAS P PNTDELMS RSQ PKV Y I S A 
NAATGDEQVMRNQPTLF I STKTSGASAASRNMSGQVSMGPAFIHH 
H P PKS RAI GNN SATS PR VWTQPNT\ EYTFK ITVS PNKP P AVS P 
G WS PTFE LTNLLNHPDH YVETEN IHHLTDPTLAHVDRI S ETRK 
LSMGS DDAAYTQD I *RISNS WLGMVAHACNSSALGGQDGRI I +A 
QEFETSWGNIWRIiRIjYRRF*NYAGMVAHTCSPSYSVD*AIjI,VHQ 
KARM ERIjQRELE 1 QKKKLDKLKSEVNEMENNLTRRRLKRSNS I S 
QIPSLEEMQQLRSCNRQLQIDIDCLTKEIDLFQARGPHFNPSAI 
HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDIiLYNLKQRGPNSSKQLLK 
SDVNYQCIjFSAHVLHIiRGVLTTQPVEDERGNVFLWNGEIFSGIK 
V2AEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
HYLWFGRDFFGRRSIiLWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VPAS\DFSELILSLLSFPDAI,FYWCILGNIFLGRILLKKMLIA* 
VXFQQTYQHLYQR* QMKPNC I LKNLLFL * I *CCHKLHWRIiI AVI 
FPMCHLQER YFKS FLLMYT * KEVIQQFI DVLSVAVKKRVLCLPR 
DENIiTANE VLKTCDRKANVAI LFSGG I DS MVI ATLADRHI P LDE 
P IDLLNVAFIAE E KTMPTTFNR3GNKQ KNKCEI PS EE FS KDVAA 
AAADS PNKH VSVPDR ITGRAGLKELQAVSPSRIWNFVEINVSME 
ELQKLRRTRI CHIil RPIjDTVLDDSIGCAVWFASRG I GWLVAQEG 
VKS YQSNAKWLTG I GADEQLAG YSRHRVRFQSHGLEGIiNKE IM 
MELGR I S S RNLGRDDRVIGDHGKEARF P FLDENWSFLNSL P IW 
EKANLTIiPRGIGEKLLDRIJAVELiGLTASAJLIiPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSLENLSIEKETKI* 


5837 


4792 


903 


NGNAVAQAP VTNCCYLATGS KDQTIR I WSCSRGRGVMI LKLPFL 
KRRGGG I DP TVKERLWLTLHWPSNQPTQIiVS S CFGGELLQWDIiT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQIilXSTSMD 
RDVKCWDIATLECSWTLPSLGGFAYSIiAFSSVDIGSLAIGVGDG 
MIRVWNTLS IKNNYDVKNFWO^VTCSKVTAiCWHPTKEGCLAFGT 
DDG KVGIiYDTYSNKP PQISST YHKKTVYTLAWGPPVP PMS IjGGE 
GDR PSLAL YS CGGE G I VLQHN PWKIjSGEAFDI NKIiI RDTNS I KY 
KLPVHTE I SWKADGK r MALGNEDGS I E I FQ \ I PNLKL I CTIQQH 
HKL VNTI S WHHE \HGS PAQKLS YI»\MPSGSQQCS PFTCHNLKNC 
P* KAAPES PSDPLQS PYRTPPQGHTAQD YPVWAWEPH IH * WEGL 
VFCFP IDG YS PGCWD \ AFPGKEAP VAI FRG\HQGRLLCVAWSPL 
DPDCI YS G \ ADDFCVHKWLTSMQDHSR P PQGKKS IELEKKRLS Q 
PKAKPKKKKKPTIiRTPVKLESIDGNEEESMKENSGPVENGVSDQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
1 LLKKEPP KEKPETI* IKKRKARS LLPLSTS LDHRSKEELHQDCI* 
VLATAKHSRE LNED VSADVEERFHIiGLFTDRATLYRM ID I SGKG 
HLENGHPELFHQIjMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWLWAVEAFAKQLCFQDQYVKAASHLLS IHKVYEAVEUUKS 
NHFYREAIAIAKARLRPSDPVIiKDbYLSWGTVLERDGHYAVAAK 
CYLGATCAYDAAKVIAKKGDAASLRTAAELAAIVGEDELSASLA 
LRCAQELLlJ^NNWVGAQ2ALQIJiESIJQGQRLVFCLLEI>LSRHLE 
EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQY 
QEAFQKLQN I KYPSATNUTPAKQI»I*1jHI CHDLTLAVLSQQMASW 
DE AVQALLRA WRS YDSGS FT I MQEVYSA FLPDGCDHLRDKIjGD 
HQS PATPAFKSLEAFFLYG RLYEFWWSLSRPCPNSSVWVRAGHR 
TLS VE PSQQLDTASTEETDPETSQP EPNR PS ELDIiRLTEEGERM 
LSTFKELFSEKHASLQNSQRTVAEVQETLAEMIRQHQKSQLCKS 
TANGPDKNBPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRIiTE 
ANQRMAKFPES I KAW P FPDVLE CCLVLLLI RSHF PGCLAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECLDPAQRDLYRBVMLENYSNL 
ISLDLESSCVTKKLS P EKEI YEMES \ PSGRI VIGNVSTITFQYNG 
I/3DNMECKGNLEGQVS KSEGLYMCVKI TCEE KATESHSTSSTFH 
RI I /HYQGKI VKCKE CRQG FS YLSCLI QHEEWHNI * KCSEVNKH 
RNTFSKKPSYI*HQ\KFRIjGEKPYECMECGKAFGRTSDI»IQHQK 
IHTNEKPYQCNACGKAFIRGSQIiTKHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKP YECKECGKAF I LG SHI»TYHQRVHTGEKP YI CKECGKAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKECGKAFI SNSNL I QHQRIHTGEKP YKCKECGKAF I CGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYIiTQHEKIHGEKHYECKEC 
GKTFVRATQI>TYHQRIHTGEKPYKCKECDKAF/HI*WLTILSEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFS RGSEHTLHQR I HTGEKP YTCVQCGKDFRCPSQLTQHTRI, 
HN* E YSSHKICMHS IALAS LDFAHLQEKNPEN 


5839 


1 


2425 


GRPFPRPPRAI»PRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAI* 
EEVEGDVAEI»ELKL\ DKLVKLC I A\M I DTG KAFCVANKQ FMNG I 
RD\IAQNS \NNDA\ WETKFAPSFLDSLQEM INFHTIL/I,* PJ5S 
EIN* GHS FQNFVKEDtiRKFKDAKKQFENSQ* KRKKIALVKNAPV 
PSRPASIjEL*KPPMILTATRKCFRHIALDYVLQINVIjOSKRRSE 
I LKS WhS FMYAKIjAFFHQG YDLFS EIjG P YMKDLGAQLDRLVGDA 
AKEKREMEQKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RASNAFKTWNRRWFS IQNNQWYQ KKFKDNPTVWEDLRI.CTVK 
HCEDI ERRFCFE WSPTKSCMLQADSEKliRQAWI KAVQTSI \AT 
AYREKDDESEKLDKKSSPSTGSIiDSGNESKEKLLKGESAI>QRVQ 
CIPGNASCCDCGLADPRWAS INLGITI>CIECSG IHRSLGVHFSK 
VRSLTLiyrWEPEIjLKIjMCEIjGNDVINRVYEANVEKW3IKKPQPG 
QRQEKEAYIRAKY VERKFVDKIFL* SLSPP\EQQKK\ FVSKSSE 
EKRLS I SKFGP \GDQVRASAQSSVRSRDSGIQQS SDDGRBSLPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPKM 
AEAlxAHGADVNV7ANSF^NKATPLIQAVI^GSLVTCEFLI>QNGAN 
VNQRD VQGRG PI*HHATVLGHTGQVCI»FI*KRG ANQHATDEBGKDP 
LS I AVEAANAD I VTLIjRLARMNEEMRESEGL YGQPGDETYQD I F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHIiTTLWQISSPRWRSpQRAFMSALSKTQTQSAPALQ 
GLSSLLQSVTGNPVPASEAASQSTSASPANTTVYTI kgrklpss 
AQP F I P KS FN YS PNS STS E VS STSAS KAS IGQS PGLPSTAFKL P 
SNTKG FTATHNTS P AAP P TE VTTCQS SE VSKPKI>\ ES ESTS P s l 
\2^IHl»FIJCGNPGFSVA*KrLKHPKPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKI IS PGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLKDSSQEKFYPDTSFQEDEDYRDFEYSGP 
PPSAMMNLQKKPAKS IL.KSS KLSDTTEYQPILS SYSHRAQEFGV 
KSAF PPSVRALLDS S ENCDRLS SS PGL FGAFS VRGN E PGSDRS P 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPr-IPVPHRS 
LFSPQNTLAAPTGHPPTSGVEKVI^TISTTSTIEFKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSIiTATDQQQQEEHY 
R I ETRVS S S CLDLPDS TEEKGAP I ETLG YHSASNRRMSGEP IQT 
VES IRV PGKGNRGHGREAS RVGWFDLSTSGSSFDNGPS SAS ELA 
SLGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 
VDLSNPFTKEAAIAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESI, 
TIiPSHS LEHLGPPHGGGGGGGSNSSSGP PLGPSHRDT I S RSGI I 
LRSPRPDFRPREPFLSRDPFHSLKRPRPPFARGPPFFAPKRPFF 
PPRY 


5841 


1908 


762 


GLRLFLVLTVW PMMKPSWLSRTEFS KRLLCRTLWCQSGWSSRS Y 
TRSMLKMTTS I NRRS RTS TKSTRTS ARPGLTATVS IGLSDS PTW 
RHCWMTARSCSGEKGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 
GDGLDSGLARRGSAVSALASGLVEEPMLGPPFHPTPRFKAVSAK 
SKEDLVSQGFTEFTIEDFHNTFMDLIEQVEKQTSVADLIASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 
\VEPMCKESDHIHIIALAQGLQRVHPGWEYMGPRPRAATTNPHI 
FP* GLPS PKVYDIiYRPG\HYD I LYKIGLGSSPLGCPGCPLLARA 
LGHCYRGFSVWKWSYFTPFFLSHDPPPMFY 


5842 


307 


1918 


QEPTADFKLR STCGCGREMTCPDKPGQL I NW F I CS LC VPRVRKL 
WSSRRPRTRRNLtjLGTACAIYLGFLVSQVGRASLQHGOAAEKGP 
HRSRDTAEPS FPEI PLDGTLAPPESQGNGSTLQPNWY ITTiRSK 
RSKPANIRGTVKPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 
KLMli*HLGTLREQTWIfRLESDPGGWCGVRE/WRAGGFDFLQPSS 
RESNIRI YSESAPSWLSKDDIRRMRLLADSAYAGDRPVS SRSGA 
RLLVLEGGAPGAVLRCGPS PCGLLKQPLDMS EVFAFHLDRI LGL 
NRTLPS VSRKAEFIQDGRPCPI I LWDASLSS ASNDTHS S VKI>TW 
GTYQQLLKQKa^QNGRVPKPESGCTEIHHHEWSKMALFDFLLQI 
YNR bDTNCCG FRPRKEDAC VQNGLRPKCDDQGSAA1AH 1 1 QRKH 
DPRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 
LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVLPMNE 


5843 


500 


1453 


GTARIiVTCWVIJiGQ*VKKPAMEPGWWL*Q*RCRPKGWGLGAGM 
R3SRMS0PPQCLRRAQSSCCHFMVKLLDDGTFMIPGEKVAHTSL 
DAIjVTFHQQKP I E PRRELLTQPCRQKD? ANVD YED LFLYSNAVA 
EEAACPVSAPEEAS PKPVLCHQSKERKPSAEM / RQNNHQGSHFL 
LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCELWT 
LGCPE I HGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHI SQKP 
LTAPGTKRQKG PHQEGRE VGQLH* GD PRGQELAPNGS ES P I LPG 
VQARAPGLGRA 


5844 


202 


2471 


FDSAVLSSINVMAVUPGPLQLLGVLLTISIjSSIRIjIQAGAYYGI 
KPLPPQ I PPQMPPQI PQYQPLGOQVPHtaPLAKDGLAMGKEMPHL 
QYGKE Y PHLPQ YMKE I Q PAPRMGKEAVPKKGKE I PLASLRGEQG 
PRGEPGPRGP PGPPGLPGHGI PGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPMGI P* PQGPPGPHGLPGIGK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGP PGP VGL PG VG KPGVTGFPG P \QGPLG K\ PGAPGEP 
GPQGPIGVPGVQGPPGIPG IGKPGQDG\ I PGQPGFPGGKGEOGI* 
PGLPG P PGLPG I GKPGFPG P KGDRGMGGVPGALGPRGE KGP IGA 
PGI GGP PGE PGliPG I PGPMGPPGAIG F PGPKGEGG I VG PQG P PG 
PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGP PGP PG P PG P PA VMPPTPP PQGEYLPDMGLGIDGVKP PHA YG 
AKKGKNGGPAYEMPAFTAEIiTAPFPPVGAPVKFNKIjIiYNGRQNY 
N PQTG I FTCE VPG VY YFAYJHVHCKGGNVWVALFKNNEPVM YTYD 
EYKKGFIiDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSAS LQD KMAN P KEKTAM CLVNELARFNRVQ PQYKLLNER 
G PAHSKMFSVQ LSLGEQTWESEGS S I KKAQQ AVGNKALTESTLP 
KPI * KPPKSNVNNNPGCITPTVELNGLAMKRG\ KPAIHRPLDPK 
P FPNNRANYN FQVM YNQRYHCP I PKI FYVQLTVGNNEFFGEGKT 
RQAARHNAAMKALQALQNEP IPERS PQNGESGKDMDDDKDANKS 
E ISLVFE IAIiKRNMP VS FEVI KESGPPHMKS FVTR VSVGE FS AE 
GEGNSKKLS KKRAATTVI#ELKKLPPLPWEKPK\HFFKKRPKT 
IVKAGPEYGQGMNPISRLAQIQQAKKEKEPDYVLLSERGMPRRR 
EFVMQVKVGNEVATGTGPNK3CIAKK^AAEAMLLQLGYKASTNt>Q 
DQLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 
RHKVISGTTLGYLS PKDNWOP^ S Q gF<; t"q pt smc c a t't a ntr t' m — 
NGTSSTAEAIGLKGSS PTPPCSPVQPS KQIiEYLARIQGFQVHYC 
DRQSGKECVTCLTLAPVQMTFHAIGSS IEASHDQV* YATAILLC 
YG PAR KWKA I KMEAM CAHAALLS L IHYLLAPS ARLE KS KLFALG 
N- 


5846 


1126 


456 


FS KLI K KTF 1 1 GI SG VTN SG KTTLAKNLOKHL PNCS V I S ODDFF 

KPESE I ETD KNGFLQ YD VLEIALNMEKMMS AI S CWMES ARHS VVS 

TDQES AEE I P I !•! I EGFLLFN YKPI»DTI WNRS YFLTIP YSECKR 

RRSTR VYQ P PDS PG YFDGHVW PM YLKYRQEMQD I TWEW YLDGT 

KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS/CK+IRK 
LQGVI 


5847 


2769 


505 


AP EMEDLS SPDSTLLQGGHNLLSSAS FQES VTFKDVI VDFTQEE 
WKQLDPGQRD1»FRDVTLENYTHI*VS IG1.QVS KPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPD1SEEELSPEVIVEK 
HKRDDSWSSNLLESWEYEGSUERQQANQQTLPKEIKVTEKTIPS 
WE KGPVNNEFGKS VNVSSNLVTQEPS PEETSTKRSI KQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCN*/CVEKAF 
SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTOHQRIHTGE 
KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECGKAFSQRG 
HFMEHQKIHTGEKPFKCDECDKTFTRSTHLTQHQKIHTGEKTYK 
CNE CG KAFNG P STFIRHHMIHTGE KP YECNECG KAFSQHSNLTQ 
HQKTHTGEKPYDCAECGKSFSYWSSLAQHLKIHTGEKPYKCNEC 
GKAFS YCS S IiTQHRR IHTREKPFE CS ECGKAFS Y LSN LN QHQKT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCWECGKTF 
SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KP YECA3CG KA FRHCSS LAQHQKTHTE E KPYQCNKCEKT FS Q S S 
HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGBKPYK 

CNECGK\TFSQSTYIjIQHQRIHSGEKPFGCNDCGKSFRYRSAIiN 
KHQRLHPGI 


5848


22 


2961 


AAPRRLuRGGDGDRTPRFPLPAtjLRPGPPAEAAPERRKMPAVSlC
GDGMRGIiAVF 1 SD I RNCKSKEAE I KR I NKEIjAN IRS KFKGDKAL 

D3YSKKKYVCKLLFIFLLGHDIDFGHMEAVNLLSSNRYTEKQIG 

YLF ISVLVNSNSELI rlinnai kndlasrnptfmglalhciasv 
gsremaeafageipkvlvagdtmdsvkqsaalcllrlyrtspdi* 
vpmgdwtsrvvhlijtoqhi^wtaatslittlaqknpeefktsv 

SLAVSRLS \RIVTSASTDLQDYTY* FCPGFLGLSVKLLRLLQCY 
P PPDPAVRGRLTECIiETIIiNKAQEPPKS KKVQHSNAKNAVLFEA 
I St*I IHHDSEPNLIjVRACNQIKSQFI^HRETNI^YIiALESMCTLA 
SSEFSHEAVKTHIETVINAUCTERDVSVRQRAVDIiLYAMCDRSN 

a?q ivaemls ylietadys iree i vlkva i laekyavdytw\ yvd 
til^iriagdyvseevwyrviqxvinrddvqgyaaktvfealq 

APAC!HFNI>VK"W^f3VTT ./^Cir/TJT T t\ nr\r>T> o <-»t^it -i-^-vrwr* t 

■rurru-n n.vi AivrwV V^Vj * J. liOCt t*J>l Li± AGDPRS S P JblQ FHltLHS KFHI* 

csvptralllstyikfvnlfpevkptiqdvlrsdsqlrnadvel 
qqraveylrlstvastdilatvlehmppfperessilaklkkkk 
gpstvtdledtkrdrsvdvnggpepapastsavstpspsadllg 
lgaappapagpppssggsgllvdvfsdsaswaplapgsednfa 
rfvcknngvlfenqllqiglksefrqnlgrmfifygnktstqfl 

NFTPTLI CSDDLQPNLNT OTKPVDPTVEGGAQVQQWNI ECVSD 

fteapvlniqfryggtfqnvsvqlpitlnkffqptemasqdffq 
rwkqlsnpqqevqn I fkakhpmdtevtkaki igfgsalleevdp 

NPANFVGAGIXHTKTTQIGCLLRLEPNLQAQMYRIjTLRTSKEAV 
SQRLCELLSAQF 


5849


3545 


1895 


KRREIKETVFHHVAQAGLELLSSSWPPSSASRSAGITGMRHQVQ 
P *DPC!MSLS P PCFTEEDRFSLEALQTIHKQMDDDKDGGI EVEES 
DEFIREDMKYKX3ATNKHSHTJIREDKHITIEDLWKRWKTSEVHNW 
TLEDTLQWLIEFVELPQYEKNFRDNNVKGTTLPRIAVHBPSFMI 
SQLKISDRSHRQKLQLKALDVVLFGPDTRPPHNWMKDFILTVSI 
VI G VGG CW F AYTQNKTS KEHVAKMMKDIjES LQTAEQSLMDLiQER 
I*E KAQEENRNVAVEKQNL* RKMMDE I N YAKEEACRLREljREGAE 

celsrrqyaeqeleqvrmalkkaekefelrsswsvpdalqkwlq 
lthevevq yyni krqnaemqlaiakdeaeki kkkrstvfgtlhv
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AHSSSLDEVDHKILEAKKAI^ELTTCLRERLFRWQQIEKICGFQ 
IAHNSGLPSLTSSLYSDHSWVVMPRVSIPPY?IAGGVDDLDBDT 
PP I VSQ FPGTMAKPPGSLARSS SLCRSRRS I VPSS PQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EEAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 
1 1 S/DERYQEMRCP+ RI PSGGII> 


5850 


3 


1895 


KAVLNFSASGSVISLTGSNPMHDASMWHLKKNGIIVYLDVPIiLN 
LICRLKLMKTDRIVGQNSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGAS PEEVADKVIiNA I KR YQDVDS ETFI S TRHVWP EDCEQKVSA 
EFFI EAV I EGLASDGGLF VPAKEFPKLS CGEW KS LVGAT Y VERA 
Q I LLERCI HPAD I PAARLGEM I ETAYGENFACS KI AP VRHLSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQIIGSQ 
RENGWAVGVESDFDFCQTAI KR I FNDSDFTGFIiTVEYGTI LSSA 
KSINWGRLLPQVVYHASAYLDLVSQGFISFGSPVDVCIPTGNFG 
KILAAVYAKMMGI P IRKFI CASNQNHVWTDFI KTG\HYDLRGKE 
R*AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAKWADR VQDKTCP VI I SSTAH YS KFAP AIMQALKI KE I 
NETS S S QLYLLGS YNALPP LHEALLERTKQQ EKME YQVCAADMN 
VLKSHVEQLVQNQFI 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAIAAAEE PAGS PS VMTRAGDHNRQ 
RGCCGS LAD YLTSAKFLLYLGHSLSTWGDRMWH FAVSVFLVELY 
GNSLI,L.TAVYG1jWAGSVLVLGAIIGDWVDKNARI*KVAQTSLW 
QNVSVI LCGI ILMMVFLHKHELLTMYHGWVLTSCYILI ITIANI 
ANLASTATAITIQRDWI WVAGEDRSKIiANMNATI RR IDQLTNI 
£J\PMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVI»LfWKVYQKT 
PAIAVKAGLKE EETELKQLNLHKDTE PKPLEGTHLMGVKDSNIH 
ELEHEQEPTCAS QMAEPFRTFRDGWVS YYNQPVF/ LGWHGSCFP 
LYDCPGIj* LHHHRVRLHSGTEWFHPQ Y FDGS ISYNWNNGNCS FY 
LATS KM WFGSDRSDLRIGTAFIiFDIiVCDLC I HAW K P PGLVRFS F 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSS YPSLPALLRARSAPGHCTHRSCGPEWRIDS I S RLEMQGARR 
SGWAQAQPTI LLLVPRLRKSLPS IWG /SLMGFFI TSGPG/ WFRQ 
YYFFI SGRH+ VLFTESDFYYVAMDFGGHGL3SHYS PGVPYYLQT 
FVSB IRRWAGKKQSVYFRRCGGCSRAP PLITGGGVGSRKQRWP 
ESGAWAIiAPGL PAIHGRS WES 


5853 


223 


1346 


RbLGLSRVKGLHGPAASAWISDPETRGDPGGPWGMWRGSDLRPR 
PVSLTGIjTOVCK*AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLKPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGI RAAGPGH 
GGVE I PQG / VGS LGARRGLRPSRPSSRHRNRVPAPP PGRPLATP 
HRRRFPPDPALTCPGLGQDOGPREQQKQGSGRHDTILGDWGESE 
SRWVRGN FRTGTAATLIG FSRN PTLNGSENWGSLVS I QEEGPDT 
GWEREKRNPAEMGNPQRWASPIHTPPLGPEILRAMPEALRAMPE 
ALGLRPDPATSVPSALS/QTF/PESWPRSCLRNQGETLGMGPVP 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKGAAjLNNRENASS*NGY/SRWKQDIRRIENHIIQE 
LXHLCAMIKRVLLERriENTRKLRELTEGRTLDWPQNRITEVSAK 
RQI VTE YREKGKRN* EEKKRDLEGRSRRYNLCI IG I PETEDRAS 
GAETI KDliLE/ ENFPELKNELDIiQMEKAHR I PLKFNEKKAASRH 
I RVTFL/ KFQRRN I LQASSQRKQVTYKGAKVRLTS DFS PAILNA 
RRQW/N/ P I SRVLRENN FEPR 1 1 YS AKLS FLY KGN WKT FLD IQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLLPSLLMGYSESPPPITDSWAP 
FISLTHHVLSQSQS PLS SNCWI CLSTHTQ* FTALPADLLTWTQS 
NVSLH I S YLAI PFLADSFLKPV/L* PGKSAKHLS FKLSSLSMVS 
GRAVAIiLHLIASGLTSIQTNTASSKPPIWGY\LSTOTSFISPPP 
LCLSRTYPNPAHATMVGQVPQS LCGLI FTL/ RTPCRPS I LHPNY 
KI I STSAWQKVLCFSGSPTIHTSLHLTTGSSFLSFHP I PGFPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT*QPPHRGSN/RLTVDKDN








FFLSPKPNsJjHQLPSQ\TPYQALTGAALAGSYPIWENENTLSWL 
PT FTYN FCLST PS L F FLCDTN * YLCL PAN WS GT CTL VFQAPT I N 
ILPPNQTILISVEASISSSPIRNKWALHLITLLTGLGITAALGT 
GIAGITTSITSYQTLFTTLSNTVEDMHTSITSLQRQLDFLVGVI 
LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
JWi£ ' 1J " vJ v ADb W WQGSS LLRV7 1 P W VAP FLG PLI FL FLLLM I G P 

C1FNLVSRFISQRLNCFIQASMQKHIDNIFHLCHV* YQSLRGNH 
SEAPEPRP 


5856 
5857 


173 


1137 


PWLHGLGLSAVFI,FY2j* /YVTFHLYGGI ILLLIiIFIS IAGILYK 
FQD VLL YFPEQ PSS S RL Y VPM PTG I PHEN I FI RTKDG 1 R LNLI I* 
IRYTGDNS PYS PTI I YFHGNAGNlGHRLPNALLMLVNLKVNLLL 
VDYRGYGKSEGEAS EEGbYLDSEAVLDYVMTS PDLDKTK I YLSG 
RSLG\GAAAIHuASDNSHRISAIMVENTFLSIPHMASTLFSFFP 
MRYliPLWCYKNKFIiSYRKISQCRMPSLFISGLSDQLIPPVMMKQ 
LYELS PSRTKRLAI FPDGTHNDTWQCQGYFTALEQFI KEWKSH 
SPEEMAKTSSNVTI I 


5858 


1597 


563 


KlilGKVLVLSWADAMAAFAVEPQGPALGSEPMMLGSPTSPKPG 
VNAQFLPGFLMGDLPAPVTPQPRS ISGPS VGVMEMRS PLLAGGS 
PPQPWPAHKDKSGAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQS PLVGVTS TPGTGQSMFSPAS 1GQPRKTTLS PAQLDP FYTQ 
GDSLTS EDH\ LDDSWGDCI WGFLKASA\SY I LL\QFAQYGGIS * 
NMWMSNTGNWMH I RYQS KLQAR KALS KDGR I FGES I M IG VKPCI 
EKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTSD YQVI S DRQTPKKDESLVSKAMEYMFGW 


5859 


355 


1419 


PPHQ PAAAS TSXHQQQQ P P P PPQDS S KP WAQG PGPAPG VGS AP 
PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
PS SGVPTTP PQAGGPP P P PAAVPG PG PG PKOG PG PGGP KGGKMP 

GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 
LLG I YLLI S RRMNSRRL FAKI WENQEKFLS TKAKDSEF I KliESR 
ALA * N CPKFELG * YTP * GGRQLPS S LFPTHACLPLS CS VI FS P F 

MPPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCIiN 
FAS 


5860 


307


1503 


ggssarprassrj^lsrkktknevskpaevqgkyvkket'spllr 

NLMPS FIRHGPTI PRRTDICLPDSSPNAFSTSGDGWSRNQSFTj 

rtpiqrtpheimrresnrlsapsyiarsiadvpreygssqsfvt 
evsfavengdsgsryyysdnffdgqrkrplgdrahedyryyeyn 
hdlformpqnqgrhasgigrvaatslgnltnhgsedlplppgws 

VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGr 
YYVDHTNKKAQ Y\ RH PCAFTCTS V* STTS CHI / AS /RQQTERNQ 

SLLVPANPYHTAEIPDWLQVYARAPVKYDHIbKWELFQLADLDT 

YQGMLKLLFMKELEQIVKMYEAYRQAIiLTBLENRKQRQQWYAQQ 
HGKNF 


5861 


2956 


1270 


TI RVEEF P LCPGGGKAQLS S ASLLGAGLLLQPPTPPPLjLLLLFP 
LLLFS RLCGALAGPI I VEPHVTAVWGKNVSLKCLI EVNETI TQI 
j-k, Ki>i>QT VAVHHPQYG FS VQGE YQGR VLFKNYS LNDA^ I 
TLHNIGFSDSGKYICKAVTFPLGNAQSSTTVTVLVEPTVSLIKG 
PDSLIDGGNETVAAXCIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETAT 1 1 SQ YKLFPTR FARGRR I TCVVKH PALEKD I R YS FI LD I 
QYAPEVS VTG YDGNW FVGRKGVNLKCNADANPPPFKS VWSRLDG 
QWPDGLIiASDNTLHFVHPIjTFNYSGVYICKVT\NSPGS kevtqk 
VHPTFQDPSLPTYP PLPALQ FQWAS PSTA * TSRD \ LATE P * KI A 
PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGD YFAKNY I PPSDMQKESQIDVLQQDELDP YPDS V 
KKENKNP VNNLIRKD YLE EPE KTQWNNVENLNR FERPMDY YEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 




2051 


1305


EVCACVQAFWLVASSGDDSQGGDKCGCEVGSlflVGSMRVVMARLL 
S EGEQG I PTACAAFAQQPAG/ E PRRGLAGVGEGGPQCS WVNYRC 
rLEFLVSLLGTDLARGRGNSASGPTAPADSKQL/ML*DVHRRVI 
L»E * RMNSGSPARDNAPSQRFCTNLSEGLRFG IS PSWREAL YGCH








A 


5862 


1556 


483 


PPFQL I MGE I KVS PDYNWFRGTVPLKKI I VDDDDS KIWSLYDAG 
PRS IRCPLI F1>P PVSGTADVFFRQ I LALTGWGYRVI ALQYPVYW 
DHLE FCDGFRKLLDHLQliDKVHLFGAS LGGFLAQKFAE YTHKSP 
RVHSLI LCNSFSDTS I FNQ TWTANS F WLMPAFMI>KKI VLGNFSS 
GP VDP MMADA I D FMVDRLES LGQS EIiASRLTLNCQNS Y VE PH KI 
RDI PVTIMDV FDQS ALSTEAKEEMYKLYPNARRAHLKTGGNFPY 
LCRSAEVNLYVQIHL/R /RNSMEPNTRPLTHQVISVPRS LRCRKA 
ALASARRSSS VSIiAVNDELTRCVLV* SVAS APVSRPFPSGS SGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPLAAPREDTMGPLMVLFCLL.FLYPGLADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGIjYPSPASRLCKSSGQWQ 
TPGATRSLSKAVCKPVRC PAPVS FENG I YTPRLGS YPVGGNVSF 
ECEDGFI \LRGS PVRQCRPNGMWDGETAVCDNGAGHCPNPG I SL 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 
PICRQPYSYDFPEDVAPALGTSFSHMLGATWPTQKTKESLGRKI 
QIQRSGHLNLYXiLIjDCSQS VS ENDFLI FKESASLMVDR I FS F3I 
NVS VAI I TFAS E P KVLMS VLNDNS RDMTE V IS S IiENAN YKDH3N 

gtgtntyaalnsvylmmnnqmrllgmetmawXqei RHAI I LL\T 

DGK\SHMGGS PKTAVDHIRE I LN INQKRNDYLD I YAIG VGKLDV 
DWRELN2LGS KKDGERHAFI LQDTKALHQVFEHMLDVS KLTDTI 
CGVGNMS ANASDQERTPWHVTIKPKSQET\ C\RGAI,I SDQWVLT 
AAHCFRDGNDHSLWRVNVGD PKSQWG KE FLI E KAVI S PGFD V FA 
KKNQGIL\EFYGD\DIAliL\KIAQKVKTi\STHCOGPSCLP\CTM 
\EANLGFIiRETFKGSTCR\DHENEI*/\A , ?NKQSV\PAHF\VAIj\lSf 
GSKI*EHLTIiRMGVEWTSCCRGLSPKKKTM\FPNI,T\DVRE\WT 
D\ QFI» \ CS \G PQEDESP \ CK * E\ SGGA\ VFLERR FRJbSAGG VWC 
SWGL.\YNP\CT^SA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q*SPWLRQHPGGMS * I FI>PLLANGHLS P FACPAR I CRPLKFLPS 
EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPbbCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPS IPPSS PIiACVIiKNIiKPLQLTPDLKPKCI.1 FFCNTAW PQY 
KLDNDSK* PENGTFEFS I LQVLDNS CHKMGKWS E VPDVQAFF\ S 
HWSLPSl>CSQC/GUlPNIiSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPFHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTIVGCI FFKTA1 1SHFKGGMYLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\ICRCISMYTREHAC 
ACTRV * V YMCMS / VCTCVSTC IDVRVCAHVCVYMCl/CIiGYA* AC 
TCV*MCVCMHEHVa4C/VCACSCVLL/CRGHICM/MCMSAYICI 
/CVYVCVLCVWACMRMSTCVWLVYG * ACTCVWMHM/ CSCTCR/C 
VHVCCMSMHACECLCVYLHI CGCAGTRR WWAGS ARG SRSGSRLP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
G3PAAVCSRNCTVS PRRGADCFE APDVP KQPPGWGRAS FEERG C 
GGRGWVCAPPLKGPQCCCFS I KPELKAKKKK 


5866 


98 


3197 


ARPEVPAP PAWIjSRRGAAKMGDKKDDKDS PKKNKGKERRDLDDli 
KKEVAMTEHKMS VBE VCRKYNTDC VQGI/THSKAQE I I±ARDG PNA 
LTPP PTT P EW VKFCRQLFGGFS ILLWIGAI LCFLAYGI QAGTED 
D PSGDNLYLG I VLiAAWI ITGCFSYYQEAKSSKIMESFKNMVPQ 
SiALi V XKU.Q?h.KMU VNAfcbVVVODLVEI KGC* Uit V PADLR 1 1 S AHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGVWATGDRTVMGR I ATIASGL E VGKTP IAI EI EHF I QLI TGV 
AVFIX5VSFFI LSLI LGYTWLEAVI FLIGI IVANVPEGLLATVTV 
CZiTIjTAKRMARKNC WKNLEAVETLGSTST I CSDKTGTLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTSFDKSSHTWVAI>F*H/LLGFC 
NRPVFKGGQDNI PVLKRDVAGDASESALL KCI ELSSGSVKLMRE 
RNKKVAE 1 P FNSTNKYQL»S IHETEDPNDNR YLLVMKGAPER ILD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGIiMSMIGPPRAAVPDAVG 
KCRSAGIKVIMVTGDHPITAKAIAKGVGI IFEGNETVEDIAAPJL.








N I P VSQ VN P RDAKAC V IHGTDLKDFTS EQ I DEI LQNHTE I V FAR 
TS PQQKL 1 1 VEGCQRQGAI VAVTGDG VNDS PALKKAD I G VAMG I 
AGSDVSKQAADMILLDDNFASIVTGVEEGRLIFDNLKKSIAYTL 
?SNIPEITPPLLFIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESDIMKRQPRNPRTDKLVNERIilSMAYGQIGMIOALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQWTYEQRK 
WE FTCHTAFFVS I VVVQWADLI 1 CKTRRNSVFQQGMKNKIL I F 
GLFEETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYS fli fv 
YD E I RKL 1 LRRNPGGW VEKET Y Y 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPS S ^ VAKP 
GPVKTLTRKKKTKKKKRFWKSKAREVSKKPASGPGAWRPPKAPE 
DFSQNWKALQEWLLKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 
ETS PQ VKG EEMPAG KDQEASRGS VPSGS KMDRRAP VP RTKASGT 

EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRALALDCEMVG VGPXGEESMAAR VS I VNQYGKCVYDK YVKP 
TE PVTD YRTAVSG I RPENLKQG E ELE WQKEVAEMLKGRI LVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSIiRLLSEK 
ILGLQVQQAEIICSIQDAQAAMRLYVMVKKEWESMARDRRPLLTA 
PDHCSDDA*QSCPAAAAAPLQRQCDQSQGQITSPQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGAS HTQDASQSTS A KYPAAAQNL/ CVTNAMREDLADI W YIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGL VTRLRERPALLVSSTS WTEDEBFS I LLAA 
LESRV* T\MTLDGHNL PS LVCV I TG KGPLRE Y YSRL IHQKH FQH 
IQVCTP WLEAED YPLLI/GS ADLGVCLHTS S SGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGI.VFEDSEEIAAQIiCMLFSNFP 
DPAGKl^QFRKNLRESQQLRWDESWVQTVIiPLVMDT 


5869 
5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/ CVTNAMREDLADI WY IR 
AVTVYDKPAS FFKETPIJDLQHRLFMKLGSMHS P FRARS E PE DPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA 
LESRV+ T\MTLDGHNLPSI>VCVITGKGPLRBYYSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 




2122 


833 


LTAGAS HTQDASQSTS AKY PAAAQNL/ CVTNAMREDLADI WYI R 
AVTVYDKPAS FFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERS AFTERDAGSGLVTRLRERPALLVS S TS WTE DEDFS I LLAA 
LESRV+T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGIiDLPMiCVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFBDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 



FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS
VLKLL* LS LRRL * LE PTI *NGLLT*CSRLS VFRFLKV\GSVYEP 
LKS INLPRPDNETLWDKLDHYYRI VKSTLLLYQS PTTGLFPTKT 
CGGDQKAKI QDS L YCAAGAWALALAYRR IDDDKGRTHELEHS AI 
KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YEE YGHLQ I NAVS LYLL YLVEM I SSGLQ 1 1 YNTDEVS FI QNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L * KQFNGFNLFGNQGCSWSVI FVDLDAHNRNRQTLCS LLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSOTl,tlinA7PVT vr-irvr^-cvry 
FLRDG YRTS LEDPNRCY YKPAE I KLFDG I ECEFP I FFL YMM I DG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRDGKLFLWGQALYI I AKLLADEL IS PKDI 
DPVQRYVP LKDQRNVSMRFSNQGPI^EITOLVVHVALI AESQRLQV 
FLNTYG I QTQTPQQVE P I QI WPQQELVKAYLQLG INEKLGLSGR 
PDRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLLID 
DIKNALQFIKQYWKMHGRPLFLVLIREDNIRGSRFNPILDMLAA 
LKKG 1 1 GGVKVHVDRLQTLISGAVVEQLDFLR I S DTEELPEFKS 
PEELEPPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 
3KLNDCSCLASQAILLGILLKREGPNFITKEGTVSDHIERVYRR








AGSQKLWS WRRAASIiLS KWDSLAPS ITNVLVQGKQVTLGAFG 
KEEE V I SNPiiS PR V I QN 1 1 Y Y KCNTHD EREAVT QQELVI H I GW I 
I SNNPELFSGTLKIRIGW I IHAME YELQI RGGDKPALDLYQLS P 
SE VKQLLLD I LQPQQNGRCWLNRRQ I DGSLNRTPTGFYDRVWQ I 
LERTPNGIIVAGKHLPQQPTljSDMTMYEMNFSLIjVEDTIiGNIDQ 
PQYRQIVVBLl.MWSIVLERNPEIiEFQDKVDLDRIiVKEAFNEFQ 
KDQSRhKEl EKQDDMTS F YNTPPLGKRGTCS YLTKAVMNIiLIiEG 
EVKPNNDDPCIilS 


5872 


68 


665 


VQGYMYRFVI KINSCYSEKTS I CRHRCCPELPATQPW PTPTVFF 
NIAIDSESLGCl\SFKIiFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYR 1 1 PG \LCQGGD FTHHNGTGGKSL YS KEFDDENF I / bKH 
TAPGVLST ANAG PTTNGS Q F F I CTAKTEDG *QHWFGKVKDGMS 
rVEALERSGSRWGKTSKKITAANCGQL 


5873 
5874


2240 


506 


RRP PEGGSGGGRRTRARMP LPWSIiALPLIjLS WVAGG FGNAAS AR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GECVGPNKCRCFPGYTGKTCSQDVNECGMKPRPCQHRCVNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAM INCQYS CE DTEEGPQCLCP 
SS GIiRLAPNGRD CliDIDE CASGKVI C P YNRRC VNTFGS Y YCKCH 
IG FELQ YI SGR YDCID I NECTMDS HTCSHHANCFNTQGS FKCKC 
KQGYKGNGLRCSAI PENSVK.EVL.RAPGTIKDRI KKLItAHKNSMK 
KKAKI KNVTPEPTRTPTPKVNIiQPFN YEEIVSRGGNSHGG\ KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGIilL\VQRKAIiTSKLEHKADIjNISVDCSFNHG\lCDW\KQDR\ 
EDDFDW\NPADR \DNAT \GFY\MAVPGLWQGHK\KDIGRLKLLI» 
PDLQPQSl^CLI»FDYRIiAGDKVGKI*RVFVKNSNNALAWEKTTSE 
DEKKKTG KIQ LYQGTDATKS 1 I FEAERGKG KTGE IAVDGVLLVS 
GLCPDS L LS VDD 




2 


3387 


ACPRIiARRRRRVRSIjRRRRGWLRARWSRGQNKMAARRITQETFD 
AVLQEKAKRYHMDASGEAVSETLOFKAQDLLRAVPRSRAEMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPS I SDD 
SYFRKECGRDLiEFSHSNSRDQVIGHRKIiGHFRSQDWKFALRGSW 

eqdfgh pvsqes s wsqe ys fg p savlgdfgssrl i e kecleke \ 
srdydvdhsg\ea\dsvlrgs\sqvqa\rgralnivdqegsllg 

. KGETQGIjI.TAKGG VG KLVTLRNV5T kki ptvnri TPKTQGTNQ I 
QKNTPS PDVTLGTNPGTEDIQFPIQKI PLGliDLKNLRIiPRRKMS 
FDIIDKS DVFSRFG I EI I KWAGFHT I KDD I KFS QLFQTItFELET 
ETCAKMLAS FKCSIiKPEHRDFCFFTIKFLKHSALKTPRVDNEFL 
NMI1LDKGAVK.TKNCFFEI I KPFDKY IMRLQDRLI.KS VTPLLMAC 
NAYELS VKMKTLSN P LDLALALSTTNS LCRKSLALLGQTFSLAS 
S FRQEKIL * AVG LQDIAPS PAAFPNFEDSTLFGREY IDHLKAWL 
VSSGCPIiQVKKAEPEPMREEEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVI EGSIiSPKERTLLKEDPAYWFLS DEN 
SLEYKYYKLKIiAEMQRMSENLRGADQKPTSADCAVRAl'ILYSRAV 
RNIiKKKLLP \ WQRRGIiL»RAQG \ LRG\ WKARRA\TTGTQTTiLFLR 
APGLKHKGRQAPGIiS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVDISEAPQTSSPCPSADIDMKDNGRTAE 
KLARFVAQVG\PEIEQF\S I \ ENSTDNPDLWFL\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNI*\HTGGGDTt\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEBEDEDDEDGGEEAPA\PGRG 
GPS LEGST PADGLPGEA\ AEDDIj/ AI/SAPALFTGIjLQVTCFP FG 
RGFSSKSIiKVGMI PAP KR VCL I QEPKVHEP VRI AYDRPRGRPMS 
KKKKPKDLDFAQQKL\TDK\NLGFO\MLQKMGWKEGHGLGSIjGK 
GIR\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQL 
IFVF 


5875 


296 


1848 


IiAALGGLPIiWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LEFSGS LFPHAI CI*GDVDNDTLNEL WGDTSGKVS VYKND0SR P 
WLTCSCC^MLTCVGVGDVCNKGKNLIiVAVSAEGWFHI*FDLTPAK 
VLDASGHHF/TIjIGF^QRPVFKQHIPANTKVMLISDIDGDGCREI, 
WGYTDRWRAFRWEELGEGPEHLTGQLVSLKKWMLEGQVDSLS 
VTLGPLGIiP ELM VSQ PGCAYAI I»I*CTWKKDTGS PPASEG PTDGS 
/SGDPS CPRRG AAPDI WP YPQQECLHSPNWQHQT\SHGTESSGS



5876 



5877 



5878



5879 



5880 




Ul.FALCTLDGTLKi.MEEMEEADKLLWSVOVDHQLFALEKLDVTG 
NGHEEWACAWDGQTY 1 1 DHNRT WRFOVDEMTPn x?r*n v ^rr 



1122 



224 



2030 



1907 



950 



2113 



981 



1138 



1324 



5881 



26 



441 



5882 



2407



2216 



| vj^p i i.UUTL»KJjM EEMEEADKL L»Wb V 0 V DHQLFALE KLD 
NGHEEWACAWDGQTY X 1 DHNRT WRFQVDENIRAFCAGLYACK 

egrnspclvyvtfnqkiyvywevqlermestnlvkllet:<p\st 

TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
WTCL I AGEG FF * TPTLP PKGVFGS HCAAAG S I TKQ 




v *«"""^&^«KVKiJKuviijVlh.UI J KQKASEYESEAKYr J Q 

DLLME S VNFS PANLS S TGSR YIiNALVDS AVALETKDTS IAS F I P 
AVNDLTSDLFRTKSKSEEIKIELEKLEKNLTATLVLEKCLOEDV 
KKAELH LS TER \ AKVDNRRQNM \ D FLKAKS EE FR FG I QAAG EQL 

sargqvdafsvpiqslvalirenwprlkqqtiplkXkklesyld 

LMP\NPSHCSK*RIEEAK\REIA\SIEAELT RRVS\MMEL 

GTLGKMAASSSGSKEKERLGGGLGVAG GI^STRERLLSALEDLEV 

LS REL I EMLA I S RNQ KLLQAGEENQ VLELL I HRDGEFQELM KLA 

LNQGK I HHEMQVLEKE VE KRDS D I QQLQKQLKEAEQ I LATAVYQ 

AKEKLKSIEKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGD 

PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR 

CPCSTVS/NGSQMTCR*INIILILQKSVCEL 

GLWKCMQLQGPHTHRVQP* P TPKQQGPQ WPVAVIAGNRPNYLY 
RMLRSLLSAOt?VSPDMTTVPTnr!vvc-i?nM™r„«, 



^wi^ui^GPHTHRVQP* PTPRUQUPy WpvAVIAGNRPNYLY 
RMLRSLLSAQG VS PQMI T VFID G Y YEEPMD WAIi FGLRG I OHTP 
IS I KNAR VSQH YKASLTAT FNLFPEAKFAWLE EDLD I AVDFFS 
I FLSQS I HLLEEDDSLYCI SAWNDQG YEHTAEDPALLYRVETMPG 
LGWVLRRSLYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECXI
PDVSRSYHFG1VGLNMNGYFHEAYFKXHKFNTVPGVQLRNVDSL 
KKEAYEVEVHRI^SEAEVliDHSKNPCEDSFLPDTEGHTYVAFIR 
ME KDD D FTTWTQLAKCI»H I WDIJD VRGNHRGLWRLFRKKNHFLVV 
GVPASPYSVKKPPSVTPIFLEPPPKEE 5APGAPEQT 
I RLTEAAAAGSG5.KAAG WAGS PPTLLPLSPT3 PRCAATMASS DED 
GTNG GAS E AG E DRE APG KR RR LG PLATA WLT FYD I AMTAG W L VI>
AI AMVRFYME KGTHRGL YKS I QKTLK F FQTPAIiLE I VHCL I G I V 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESWLFLVAWTVT 
EITR YS F YTFSLLDHLP YFI KWAR YNFFI ILYP VGVAGELLT I Y
AALPHVKKTGMFS IRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRRKVLHG \G *L* KRMI K* SLQTRCFFQNNQDYLS PS F 
NNKNKQLCEISWIVWFLKI 

SLWCLVAGGIjGIjGPSSQNPLQRAGILARPKEARGTFSALTACSA 

SVTSKGKS SSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD

♦KKKRGRCSS/WLSQPQHEREKEVVLLRRSMAEGERARAASDVL 
CRSLANETHOLRRTT.TaTauMr»oTiT 7i w/^t nrm A „. A 




^.vnviiWiASRTARDAALERVQMLEQQILAYKDDFMSERADRER 
AQSRIQELEEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYLA 

adalelmvpggwrpgtgsqqpeppaegghpgaaqrgqgdlqcph 

CLOCFSDEQGEELLRHVAECCQ 




^> n. v w^bAii VMVS C YVSG YTLTKLSMHWVRQAPG KGLE *MGP FD 

LQDVETIYPQKFQGRVSMTBETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 



j SGCVEMLYSHSLEYNPEWISVQSAVAPAQI ALNSDGDL*LHSGE 
I RTR RD*QLP3AGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
HYSKQVELELQQ I EQ KS IRD Y I QESENI AS LHNQ I TACDAVLER 
1 MEQMUSAFQSDl^SISSEIRTT^Rn.QriaMMToT t^tt,,™,^^,. « 



I ^vw^vv^Ai,VTAILEAPVTEPRFLEQLQELDAKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKIREFILQKIYSFRKPMTNYQ 

r IPQTALLKYRFFYQFLLGNERATAKE1RDEYVETLSKIYLSYYR 
SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTIF 
TLGTRGS VI S PTELEAP I LVPHTAQRGEQR YPFEALFRSQH YAL 

I LDNSCREYIiFICEFFVVSGPAAHDLFTlAVMGRTLSMTLKHLDSY 
IADCYDAIAVFIX!IHI VLRFRNIAAKRDVPAI.DR YWEQVLALLV?








PRFELILEMNVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSALV 
S INQTI PNERTMQLnGQLQVEVENFVLRVAAEFSSRKEQLVFLI 
NN YDMM LG VLM \ E* ERAADDS KEVE S FQQLLNARTQEFI E ELLS 
PPFGGLVAFVKE AEAL1 ERGQAERLRGE EARVTQLI RGFGS S WK 
SSVESLSQDVMRSFTNFRNGTSI IQGALTQLIQ\LYHRFHRV\L 
SQPQLRAL PARAELI NIHHLM VELKKHKPNF 


5883 


2 


JL J IH. 


EFPGRRFRAVMEAGAGAGAGAAGWSCPGPGPTVTTLGSYEASEG 
CERKKGQRWGSLBRRGMQAMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEL 
EETRELAGQHEDDSLSLQGLLSDERLASAQQAEVFTKQIQQLQG 
ELRSLREE XSLLEHEKESELKE I EQELHIjAQAE IQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSSEPSGSLGLSDYSGLQEELQELRERYHFLNEEYRALQESJTS 
SLTGQLADLESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
D PEMQLLRQQLRDAE EQMHGMKNKCQEL CCELEELQHHRQVS EE 
EQRRLQRELKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL+A 
LWISALLWCWWAETSS 


5884 


4261 


2522 


GVLARAS ARLRVPLTGVRACAE PE VGAE PAKVAGAAE PDEDGGR
SRLRDCGDYTPSERLGPKGAMLWFQGAIPAAIATAKRSGAVFVV 
FVAGDDEQSTQMAASWEDDKVTEASSNSFVAIKIDTKSEACLQF 
SQ I YP WCVPSS FFIGDSGI PLEVI AGSVSADELVTRIHKVRQM 
HLLXSETSVANGSQS ES S VST PS AS FE PNNTCENSQ5RNAEL CE 
I P STS DTKSDTATGGES AGHATSSQE PSGCSDQRPAEDLHI RVE 
RLTKKT »EERREEKRKEEEQREI KKE I ERRKTGKEMLDYKRKQEE 
ELTKRMLEERNREKAEDRAARERIKQQIALDRAERAARFAKTKB 
EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF 
TNQFPSDAPLEEARQFAAQXVGNTYGNFSLATMFPRREFTKEDY 
KKKLLDLELAPS AS WLLP / ALF INF * AGRPTAS I VHS SSGDI W 
TLLGTVLYPFLAI WRLI SNFLFSNPPPTQTSVRVTSSEPPNPAS 
SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 


5885 


900 


467 


AAGGGRRSRLSRSWPTGPSKSPSGVRCCG \RR \AWEDKDEFLDV
I YWFRQI IAWLGVI WGVLPLRGFLG IAGFCLINAGVLYLYFSN 
YLQIDEEEYGGTWELTKEGFMTSFA/ IVHGHLDHLLHCHPL*LM 
VYS SQVLP I QS KGPS 


5886


86 


1341 


P FRGRALTLKKQ PRPGYAP P S LGTCH KSDPGRPAAQSQPPS PGS 
GTFGLLS FRMVRTKTWTLKKHFVG YPTNSD FELKTS ELP PL KNG 
EVLLE ALFLT VDP YMRVAAKRLKEGDTMMGQQVAKVVE S KNVAIj 
PKGTIVIJu^PGWTTHSISIXSKDLEKLLTEWPDTIPLSIiAICTVG 
MPGLTAYFGLLE ICGVKGGETVMVNAAAGAVGSVVGQIAKLKGC 
KWGAVGSDEKVAYLQKLGFDWFNYKTVESLEETLKKASPDGY 
DCYFDNVGGEFSNTVIGQMKKFGR3AICGAISTYNRTGPLPPGP 
PPE IG IYQELRMEAFVVYRWQGDARQKALKIILLKWVLELPYFVI 
D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKT I VKA 


5887


1937 


104 


APGCRGCRATRCPCRGPRWDSLGDEAARSPAAPGGAPGLLGLRE
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 
PAPGLPQSRTL/ PVLCVCDLS PAQCDINCCCDPDCSSVDFSVFS 
ACS VPWTGDSQ FCS QKAV I YSLNFTANPPQRVFELVDQIN PS I 
FCIHITK \ * NLH YPLLIQKYL/NE^FDT*LMlCrSDGFTLNAES Y 
VS FTTKLD I PTAAKYEYGVPLQTSDSFLR F PSSLTS S LCTDNNP 
AAFLVNQAVKCTRKINLEQCEE I EALSMAFYSSPE I LRVPDSRK 
KVPITVQS I VIQSLNKTLTRREDTDVLQPTLVNAGHFSLCVNW 
LEVKYSLTYTDAGE VTKADLS FVLGTVSS VVVPLQQKFE IHFLQ 
ENTQPVPLSGNPGYVVGLPIiAAGFQPHKGSGIIQTTNRYGQLTI 
LHSTTEQDO^ALEGVRTPVLFGYTMQSGCKLRLTGALPCQLVAQ 
KVKSLLWGQGFPDYVAPFGNSQGP/ADMLDWVPIHFITQSFNRK 
DS CQLPGALVIEVKWTKYGSLLNPQAKI VNVTANLI S SS FPEAN 
S GNERTI L I S TAVTFVD VSAPAEAG FRAP PAINARLP FNFFFPF 
V


5888


375 


2302 


LELHPDYKTWGPEQVCSFLRRGGFEBPVLLKNIRENEITGALLP 
CLDESRFENLGVSSLGERKKIiSYIQIU.VQIHVDTMKVINDPIH 
GHI ELHPLLVR I IDTPQ FQRLR Y I KQLGGGYYV FPGASHNR FEH 
SIiGVGYIiAGCLVHAIjGEKQPELQ IS SRDVLCVQ I AGLCHDLGHG 
PFSHMFDGRFIPIiARPEVKWTHEQGSVMMFEHLINSNGIKPVME 

qyglipeedicfikeqivgplespvedslwpykgrpenksflye 
ivsnkrngidvdkwdyfardchhlgiqnnfdykrfikfarvcev 
dnelricardkevgnlydmfhtrnslhrrayqhkvgni idtmit 
daflkaddyieitgaggkkyristaiddmeaytkltdnifleil 

YSTDPKLKDAREIIjKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKIiKABDFIVDVINMDYGMQEKNP IDHVS 

fycktapnrairitknqvsqllp\ekfaeq\lirvyckkvdrks 
lya\arqyfvqw\cadr\nft\kpqdgrcy*pptp*hpqkkgw\ 
ndstfspkiptrlprrlpksrv\qlfxddpm 


5889


1831 


731 


lpaacgrpvtarprqapegrsgrprdldpyppqvfpprpdr vai
vtggtdgigystakhlarlgmhvi iagnndskakqwskikeet 

LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FG I F I L \ DLASMTS IRQF VQKFKMKK I ? IiHVL I NNAGVMMVPQR 
KTRDGFEEHFGLNYtiGHFLLTNLLLDTIiKESGS PGHS ARWTVS 
SATHYVAELNMDDLQSSACYSPHAAYAQSKLALVLFTYHLQRLL 
AABGSHVTAWVVDPGVVNTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAVJTSIYAAVTPELEGVGGRYLYMKKETKSLHVTYNQKIjQQO 
LWSKSCEMTGVLDVTL 


5890 



1322 


200 


FRRGWSAAGRAVPVAFCSR 1 5ASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTTiVHLFAGGCGGTVGAI ltcp 
LE WKTRLQS S S VTL Y I S E VQLNTMAG AS VNRVVSPGPLHCLKV 
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDS TQVHM I SAAMAGFTAI TATNP I WL I KTRLQL * /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI 
KQKLLE YKTASTMENDEES VKEASDFVGMMIiAAATS K\ LVATTI 
A YPHEWRTRLREEGTKYRS FFQTLSLLVQEEG YGS LYRGLTTH 
LVRQ I P \NTAIMMATYELVVYLLNG 


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSR I S AS SPRRPRGAVRLQSGTEAACRS 
GRPD PRPAS AAGGHAGERMSQRDTLVHLFAGGCGGT VG A I T./TCP 
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWS PGPLHCLKV 
ILEKEGPRSDFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHM I SAAMAGFTAI TATNP I WL I KTRLQL * /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSAS YAGI SETVIHFVXYES I 
KQKIiEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI 
AYPHE WRTRIjREEGTKYRS FFQTLSLL VQEEGYGS L YRGLTTH 
LVRQIP\NTAIMMATYELVVYLLNG 


5892 
5893 


1764 


379 


WLRVCGRLS VNSAVSS RTGGWS AGLTCAMQRLQWLGHLRGPA 
DSGWMPQAAP CLSG APHASAADVVVVHGRRTAI CRAGRGGFKDT 
TPDELLS AVMTAVLKDVNLRP EQLGDI C VGNVLQPGAGAIMAR I 
AQFLSD IPETVPLSTVNRQCSSGLQAVAS IAGGIRNGS YDIGMA 
CG VESMS LADRGN PGN I TSRLME KEKARDCL I PMG ITS ENVAER 
FG I SRE KQDT FALAS QQKAARAQ S KGCFQAE 1 VPVTTT VHDDKG 
TKRSITVTQDEGIRPSTTMEGIiAKLKPAFKKDGSTTAGNSSQVS 
DGAAAILLARRSKAEELGLPILGVLRSYAWGVPPDIMGIGPAY 
AI PVALQKAGLTVSDVD I FEINE\AFASQAAYCVEKLRLPP * EG 
1 -fAA* w*o i*ir w *uAjr JjHwvjHVQVX TLAQ * S * SARGKRAYRSG C 
PCAIGSWNGS PLPVFEYPWGT 




3 


1653 


ILSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVDEGLEPT 
CFERTEDIGGVWRFKENVEDGRAS I YQS WTNTSKEMS CFSDFP 
MPED FPN FLHNS KLLE YFRI FAKKFDLLKY I QFQTT VLS VR KCP 
DFSSSGQWKWTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP 
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
TAVKWMIEQQMNRWFNHENYGLEPQNKYIMKEPVLNDDVPSRLL 
CGAI KVKS TVKELTETS AI FEDGTVEENIDVI I FATGYS FSFPF 








LEDSLVKVENNMVSIiYKY I FPAHLDKSTIiAC IGLI QPLGS IFPT 
AELQARWVTRVFKGLCSLPSERTMMMDI I KRNEKRIDLFGES QS 
QTLQTNYVDYLDELALEIGAKPDFCSLLFKDPKLAWLYFGPCN 
SY+YRLVGPGQWEGARNAIFTQKQRILKPLKTRALKDSSNFSVS 
FLLKILGLlrAVWAFF \ COLQWS 


5894 


174 


1673 


RYSPKKVLQNKESSbKIiGMATALVSAHSIjAPLNLKKEGIiRWRE 
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREAI.S 
RLRELCQQWLQPETHTKEHILELLVLEQFL 1 1 LPKBLQARVQEK 
HPESREDVWVLEDLQLDLGETGQQVDPDQ PKKQKI LVEEMAPk 
KGVQECXJVRHECEVTKPEKEKGEETRI ENGKLI WTDS CGRVES 
SGKI SEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRF IQHLD 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS * ERKVIQC\HGV 
LGKAFQRS SHLiVRHQK I HLGE KPYQCNE CGKVFSQNAGIiLEHIJR 
IHTGEKPYbCIHCGKNFRRSSHLNRHQRIHSQEEPCECKECGKT 
FSQALLLTHHQRIHSHS KSHQCNECGKAFS LTSDt, IRHHRIHTG 
E KP F KCN I CQKAFR LNSHLAQHVR I HNE E KP YQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895 


2967 


86 


HPS LLG A I PF YP PPSS P WPP PL YL FWNSHR KSRH FINQRG I HGE 
KRLFVSDGVPGCLP VLAAAGRARGRAEVL ISTVGPEDCWPFLT 
RPKVPVIjQLDSGNYLFSTSAICRYFFXixLSGWEQDDIiTNQWLiEW 
EATELQPTI*SAAl»YYL\VVOGKKG\ EDVLGSVRRTLTH IDHSLS 
RQ\WCPFIiAGETESIiADIVL»WGALYPIiIjQDPAYLiPEBLSALHSW 
FQTIiSTQXEPCQRXAARRIiVLKQXQGVLALRXPYLQKQPQPSPA 
EGKGLSP I EPESEELATLSEEE IAMAVTAWEKGLESLP PLRPQQ 
NPV1.PVAGERNVLITSALPYVKNVPHLGN I IGCVLSADVFARYS 
RLRQWNTL YLCGTDE YGTATETKAL\ EEG LT PQE I CDK YH I XHA 
DIY\RWFWISFDIFGRTTTPQQ\TKIT\QDIFQQDLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYBEARGDQCDKCGK1,I 
NAVEL.KKPQCKVCRSCPWQSSQHLFLDLPKI.EKRIjEEWLGRTL 
PGSDWTPNAQF 1 TPFFGFREWPS KPRWQ *TRDL»K\ WGNPGTP* E 
G FEDK\ VFYVW FDAT IG YLS I TANYTDQ WERWW \ KNPEQVDLYQ 
FM \ AKDNVP FHS I*VFPS S ALGAEDN YTI>\ VSHLIATE YIJJ YEDG 
K\ FSKS RGVG VFRDM \AHDTG I P PD I SRFYL \liY I RPEG K\ DS A 
FSWTOLI^KNNS\EI,I^I^NFINPJV\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\irVTLELQIlYIIQ\LLEKVRIRDALRSILTIS\RH 
GNQYI \ QVNEPW\KR1 KGSEADRQRAGTVTG1AVNI AAIiLSVML 
QPYMP TVSAT I QAQLQLPPPACS ILLTNFIiCTLPAGHQIGTVSP 
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
LMDEVTKQGNI VRELKAQKADKNBVAA5VAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5896 


2967 


86 


HPSI*LGAIPFYPPPSSPWPPPI>YIiFWNSHRKSRHFINQRGIHGE 
mLFVSIXSVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVIjQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATEbQPTLSAAIjYYL\WQGKKG\EDVLGSVRRTI,THIDHSLS 
RQNNCPFLAGETESLADIVliWGALYPLLQDPAYLPEELSAIiHSW 
FOTI*STQ\EPCQR\AARRIjVIjKQ\QGVIiAIiR\PYIjQKQPQPSPA 
BGKGIiSPIEPEEEELATLSEEEIAMAVTAWEKGIiESLPPLRPQQ 
KPVIjPVAGERNVLITSALPYVNNVPHLGNI igcvlsadvfarys 

RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\ ladrfvegvcpfcg yeeargdqcdkcgkli 
navelkkpqckvcrscpwqssqhlfldlpklbkrleewlgrtl 
pgsdwtpnaqf itpffgfr ewp sk prwq * trdlk \ wgnpgtp * e 

GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 

fmXakdnvpfhslvfpssalgaedmytlKvshliateylnyedg 
K\FS KSRGVGVFRDM\AHDTG I PPDIS rfyi»\lyirpegk\dsa 
FSWTDLLLKNNS \EUjNNLGNF INRA\GMFVSKFFGG\ YVPEMV 

ltpddqrlla\hvtlelqhyhq\i*lekvrirdalrs ilti s\rh 
gnqyi \qvnepw\ krikgseadrqragtvtglavni aallsvml 

QPYMPTVSAT I QAQLQIjP PPACS I LLTNFLCTIjPAGHQI GTVS P 
LFQKLENDQI ESLRQRFGGGQAKTS PKPAWETVTTAKPQQIQA 








LMDEVTKQGNIVRELKAQKADKNEVAAEVAKliliDLKKQDAVAEG 
KPPEAPKGKKKK 


5897 


2967 


86 


HPSI.LGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVIiISTVGPEDCWPFIiT 
RPKVPVLQLDSGNYLFSTSAICRYFFMiLSGWEQDDIiTNQWLEW 
EATEI*QPTLSAALYYIi\WQGKKG\EDVLGSVRRTI>THIDHSLS 
RQ\NCPFIiAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ \E PCQR \ AARRLVLKQ\ QGVLALR \ P YLQKQPQPS PA 

egkgls pie p eeeelatlsee e 1 amavtawe xgleslpp lr pqq 
npvlpvagernvlitsalpyvwnvphlgni i gcvls advfar ys 
rlrqwntlylc3gtdeygtatetkal\eegltpqeicdkyhiiiia 
di y\rwfnis fdi fgrtttpqq\tkit\qdifqqllkrgfvi*qd 
tveqlrcehcarf\ladrfvegvcpfcgyeeargbqcdkcgkiii 
navelkkpqckvcrs cpvvqs sqhlfldlpki.ekrleewlgrtl 
pgsdwtpnaqfitpffgfrew ps kprwq * trdlk\ wgn pgtp * e 
gfedk\vfyvwfdatigyi*sitawytdqwerww\knpeqvdlyq 
fm\akdnvpfhslvfpssalgaednyti,\vshliateylnyedg 
k\ fsksrgvgvfrdm\ahdtg i ppdisr fyl\ lyirpegk\dsa 
fswtdlli>knns\ellnnlgnfinra\gmfvskffgg\yvpemv 

LT PDDQRKLA\ HVTLEIiQHYHQ\ LbEKVR I RDALRS I LT I S \ RH 
GNQYI \QVNE PW\ KR I KGSEADRQRAGTVTGIiAVN IAALLSVMIi 
QP YMPTVSATIQAQLQI*PPPACS I LIVTN FLCTLPAGHQI GTVS P 
I»FQKI»ENDQ I ESLRQRFGGGQAKTS PKPAWETVTTAKPQQI QA 
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKTJ.DLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPSoLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVJjISTVGPEDCVVPFLT 
RPKVPVLQIjDSGNYLFSTSAICRYFF\IjLSGWEQDDIjTNQWLEW 
E ATELQPTLSAALYYIj\ WQGKKG \ EDVLGS VRRTLTH 1 DHS IjS 
RQXNCPFIAGETESIiADIVLWGALYPLIiQDPAYIjPEELSALHSW 
FQTIiSTQ\EPCQR\AARRLVLKQ\QGVIiALR\pYLQKQPQPSPA 
EGKGLSP I EPEEEELATLSEEE I AMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVIiI TSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHC^F\LADRFVEGVCPFGGYEEARGDQCDKCGKI,I 
NAVELKKPQCKVCRSCP WQS SQHI>FLDLPKLtEKRIjE EWLGRTI» 
PGSDWTPNAQFITPFFGFREW PS KPRWQ* TRDLK\WGNPGTP * E 
GFEDK\VTYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTti\VSHLIATEYLNYEDG 
K\ FSKSRGVGVFRDM\ AHDTGI P PD I SRFYL\IiYIRPEGK\DSA 
FSWTDLLLKNNS\ELIJrai/3NFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRIjLA\HVTLELQHYHQ\IjI>EKVRIRDALRSILTIS\RH 
" W U* a \yvw£.FW \lvi<XJV.GSEADRyRAGl V 1\3LAVNIAAIiLSVMIj 
QPYMPTVSATIQAQLQI*PPPACSILLTNFI*CTIiPAGKQIGTVSP 
XiFQKIiENDQI E SIiRQRFGGGQ AKTS P KPAVVETVTTAKPQQI QA 
LMDE VTKQGNI VRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5899 


326 


1078 


KCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEBIDAKAEEEFNI EKGRLVQTQRLKIMEYYEKKEKQI E 
QQKKILMSTMRNQARLK^LRARNDLISDLIiSEAKLRLSRIVEDP 
EVYQGLLDKLVLQGIjLRLLEPVMIVRCRPXQDLLLVEAAVQKAI 
PEYMTISQKHVEV\QIDKEA*IAVECSWEVWEVYSGNQRIKVSN 
TLESRLDLSAKQKMPEIRMALFGANTNRKFFI . 


5900 


64 


1409 


KAAS RDS P CLE FCPLCG VS S HDLQHRMWYHRLSHIjHSRLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSIiTCAWQQHEDHFEL 
KYANTVMRFD YVWLRDHCRSAS CYNS KTHQRSL-DTASVDLC I KP 
KTI RLDETTLFFTWPDGHVTKYDLNWLVKNS YEGQKQKVIQPR I 
LWNAEIYQQAQVPSVDCQSFIiETIfEGLKKFliQNFLLYGlAFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKIiA 
IJ)RHTDTTYFQEPCGIQVFHCLKHEGTGGRTLLVDGFYAAEQVL, 








QKAPEEFEIiL>SKSAI \kheyiedvgechqphdwdwaqs* isthg~~ 
/ykelyjbl rynnydravintvpydvvhrwytahrtlti elrrpe 
nefwvklkpgrvlfidnwrvlhgrecftgyrqlcgcyltrddvi* 
ntarllglqa 


5901 




2121 


VAIEQTSLKMMQAVGGAPARPTGEYICNQCGAKYTSLDSFQTHL, 
KTHLDTVL? KLTCPQ CNKE FPRQESLLKH VTI HFMITSTYY I CE 
SCDKQFTS VDDIiOKHXLDMHTFVFFRCTLCQEVFDS KVS IQLHL 
\AVKHSNEKKVYRCTSC^DFRNETDLQLHVKHNHLEN<JGKVHK 
CI FCGES FGTEVELQCHITTHS KKYNCKFCS KAFHAI ILLEKHI* 
REKHCVFETKTPNCGTMGASEQVQKEEVELQTLLTNSQESHNSH 
DGSEEDVDTSEPMYGCDlCGAAYTMETbLQNHQLiRDHNIRPGES 
AIVKKKAELIKGNYKCWCSRTFFSENGLRE^QTHUSPVKHYM 
C P I CGER FP SLIiTLTEH KVTHS KSLDTGN CR I CKMP LQS EEE Fti 
EHCQMK PDLRNS LTG FRCWCMQTVTSTLELKIHGTFHMQKTGN 
GSAVQTTGRGQHVQKLYKCAS CLKEFRS KQDLVKLDINGLP YGL 
CAGCVKLSKSASPGINVPPGTNRPGLGQNENLSAIEGKGKVGGI, 
KTRCS * LATF KF * VLKVEL PE PHP KP FHRGVS RPD SNS TQLKTP 
OVS PMPRIS PSQSDE KKTYQCI KCQMVFYNEWDIQVHVANHK ID 
EGLNHECKLCSQTFDS PAKI*QCHLI EHS FEGMGGTFKCPVCFTV 
FVQANKLQQH I FS AHGQEDK I YDCTQC PQKF F FQTELQNHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPSIRQSIGSTSVSRWLTSLFTYliDHTADVQ*V*REF 
I PLXPRQ* ED * MFQSWLHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VEPLQTVEVETQGDDLQS LLFHFLDEWLYKFSADEFF I P \GWGE 
EFSLSKHPQGTEVKA ITYSAMQVYNEEN PEVF VI IDI 


5903 


2106 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT 
PALFALSAVPGGAASPMPPSGLRLLPLLLPIiLWLIiVLTPGRPAA 
GLSTCKTI DMEUVKRKRI EAIRGQI LSKLRLAS PPSQGEVPPGP 
L P EAVLALYNS TRDRVAGESAEPEPE PEAD YYAKEVTRVLM VET 
HNEIYDKFKQSTHSIYMFFlSrrSELREAVPEPVl.LSRAEIjRLbRL 
KLKVEQHVEJLYQKYSNNS WRYLSNRLLAPSDS PEWLS FDVTGW 
RQWLSRGGE I EGFRLS AHCSCDSRDNTLQVDINGFTTGR\RGDL 
AT IHGMNRPFLkl^TPLERAQHLQS \SRHRQAL\DTNY\CFSF 
HGGRNCLRC / VHC*HLI FRKDL\GW\ KWI \HE\ PKGYHANFC\L 
G P CPY I WS LDTQ YSKVLAJb YNQ \ HKPG\ ASAAP \ CCVPQALEP \ 
LPIVYY\VGRK?KVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEE I ENAINTFKEEQRL I YEELI KEEKTTNWELSAISRKI DTW 
ALGNSETE KA FRAI S S KVP VDKVTP S TLPEE VLDFEKFLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKES I QI WKTKKQQKREEI FKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKIiAVE AWKKQKS I EMSMKCAS QL 
KEEEEKEKKHQKERQRQFKIiKLLLESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADEISRFQERDLHKLELKILDRQAKEDEKSQ 
KQRRLAKLKEKVENNVS RDPS RIiY/ NTHQRLGRTNQKDRTNRLW 
ATS T YP T* G YSNLETRNTEKSMR 


5905 


287 


2912 


MAS FPPRVNEKE I VRLRTIGELLAPAAPFDKKCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQtXQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKKKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRI KI WDVYTGKLLLNIjVDHTG WRDL 
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNVJVY\ 
SCAFS PDSSMLCS VGAS KAWAAI LV * bRLCWHHSHTGATMVLS 
WAERVASI*ATGLGATFT I G * SNIiAFVLQG VLY VHRGWSMS TFCF 
SFFLFFFFKVISPTVKYH * LLSKL I FQFYGIGSLTSETNLM * S I 
WLSNGFSVLFFG ILSDSRDI I*RI** FNLKFVLI FF * K* CIVSVQK 
KKKPKR I ALLQEERLS*DKPPSSHLI * QTEVNIRIIiFRAI LHS * 
I*LIFRI*NCI*TYS*IIDPFYIQMTYDRG*FGKNKMVKF+FIEM 
* LY YFHKI AFS F CNW *HPCCLPKKFHLAVN I I#FACS ICFSS * A 
QVGDPSLL*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL* YLTLFISVYFS *LVFGINGFQYS FWKJLHCLYFMFRLI 
FKLTFNRNI *NRICMSALINLKTDFNLTMTLSIFFKLLI IYNA* 
YNLN* I +QF* YKMCHFVLCMSE*SYNI CI*FIAGF\LWNMDKYTM 








-RKLEGHHHDWACDFSPDGALLATASYDTRVYIWDPHNGDILM 
EFGHLFPPPTPI FAGGANDRW VRS VSFSHDGLHVAS LADDKMVR 
FWRI DEDYP VQVAPLSNGLCCAFSTDGS VLAAGTHDGS V YFWAT 
PRQVPSLQHIjCRMSIRRVMPTQEVQELPIPSKLLEFLSYRI 


5906 


146 


2038 


REGAGSGRMASGA\YNPYIEIIEQPRQRGMRFRYKCEGRSAGSI 
PGEHSTDNNRTYPS I QI MNYYGKGKV\RITLVTK\NDP YKPHPH 
DLVGKDCRD\GYYEAEFGQE\RRP\LFFQN\LGIRCVKKKEVKE 
A\ I ITR\ I KAG INPFDVP* KQLND I EDCD LDWRL W FR V FLPDG 

WYTW7.V 1* P P2V T.l)OU\ XTO O DT VriXTO* t>MtnM3T ntrom mrmnnnt»MNn 

nv>iN-u \ x xj^jjffv aIJWKAPNTAEIjRYCRVNKNCGSVRGG 
DE IFLLCDKVQKDDIEVRFVLNDWEAKGI FSQADVHRQVAI VFK 
TPP YC KAI TEP VTVKMQLRRPSDQE VS ES MDFR YLPDE KDTYGN 
KAKKQKTTLJLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QS AG I TVN FPER PR PGLLGS I GEGR YFKKEPNL FSHDA WREM P 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSS FS TRTL PSNSQG I PP FTjRI PVGNDLNASNAC I YNN 
ADDIVGMEASSMPSADLYGISDPNMLSNC^VNMMTTSSDSMGET 
DNPRLLSMNLENPSCNSVIjDPRDLRQLHQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGSMQNEQLSDSFPYEFFQV 


5907 


99 


1873 


TYLXiSSWSS * *NLDTKIKSQVKV/RKGHKKISWP YPQPAKQNGK 
KATSKVPSAPHFVHPNDHAMREAELKKKWVBEMREKQQAAREQE 
RQKRRTI ES YCQD VLRRQEEFEH KEEVLQE LNM F PQDDDEATRK 
A YYKE FRKWE YS DV I LEVLDARDPLG CRCFQMEEAVLRAQGNK 
KLVLVI^TKIDLVPKEVVEKWLDYLRNELPTVAFKASTQHQVKNIi 
NRCSVPVDQASESLLKS KACFGAKNloMRVLGNYCRLGEVRTHIR 
VGWGLPNVG KSSLINSLKRSRACS VGAVPGI TKFMQEVYLDKF 
IRLriDAPGIVPGPNSEVGTILRNCVHVQKLiADPVTPVETILQRC 
NLEEISNYYGVSGFQTTEHFLTAVAHRLGKKKKGGLYSQEQAAK 

t\ vuftUW V oUA, X£»v x XhrLJi^ALHlLiF XHIjSAEZ VKEMTEVFDIEDT 

EQANEin , MECIATGESDBLIX5DTDPLEMEIKLLHSPMTKIADAI 
EJNKTTVYKIGDLTGYCTIWPNRHQMGWAKRNVDHRPKSNSMVD 
SVDRRSVLQRIMETDPLQQGQALASAI*KNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCG I KKRGEGSGSPS PASGGFQLGCQj. PEPSLPS EEETH PHTRA 
HTRTLRATLTORPPRSHSTRLRFPMPIJX3DGGLASWK/PMRER* 
GWRR PAKAAGAS LGVAATG KRGCRMS KRYLQKATKGKLL 1 1 1 FT 
VTLWGKVVSSANHHKAHHVKTGTCE VVAIjHRCCKKNKI EERS QT 
VKCS CFPGQVAGTTRAAPS CVDAS I VEQKWWCHMQPCIjEGEE CK 
VLPDRKGWS CSSGNKVKTTRVTH 


5909 


1 


5002 


PAI PGSTI I WAPGSHS AARADGRHGS I*PS QSQAPGALCGARAP P 
SSNLRADRSMI CAQARAGKNIiYHNRFIjGIiAAMAFPSRNSQS LRR 
C KEP I R YS YNPDQ FHNMDLRGGPHDG VTI PRSTSDTDLVTS DS R 
STl^RSSYYSIGHSQDLVIHVTOIKEEVDAGDWIGMYLIDEVLS 
ENFLD YKNRGVNGSHRGQI I WKI DAS S YFVEPETKI CFKYYHGV 
SGALRATTPSVTVKNSAAPIFKSIGADETVOGQGSRRLISFSLS 
DFQAMGLKKGMFFNPDPYLKI S IQPGKHS I FPALPHHGQERRSK 
I IGNTVNPIWQAEQFSFVSLPTDVLE IEVKDKFAKSRPI IKRFL 
GKI^MPVQRLLERHAIGDRVVSYTLGRRLPTDHVSGQLQFRFEI 
TS S IH PDDEE I SLS TEP ES AQ I QDSPMNNLMESGSGE PRSEAPE 
SSESWKPEQLGEGSVPDRPGNQS I ELSRP AEEAAV I TEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGEEASALLLE 
DGEAPASTKEEPLEEEATTQSRAGREEEEKEQEEEGDVSTLEQG 
EGRLQLRAS VKRKSR PGSLP VS ELETVI AS ACGDPETPRTHYI R 
IHTLLHSMPSAQGGSAAEEEDGAEEESTLKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDEIAAPSGHVER 
S PEGLES PVAG PSNRREGECP I LHNSQP VSQLPS LRP EHHH YPT 
IDEP LPPNWE AR I DS HGRVFYVDHVNRTTTWQRPTAAATPDGMR 
RSGS IQQMEQLNRR YQN I QRT I ATERSEEDSGSQSCEQAPAGGG 



WO 01/53312 



PCT/US00/34263 











GGGGSD3EAESSQSSLDLRREGSLSPVNSQKITLLLQSPAVKFI 
TNPEFFTVLHANYSAYRVFTSSTCLKHMILKVRRDARNFERYQH 
NRDLVNFINMFADTRLELPRGWE I KTDQQGKS FFVDHNSRATT F 
IDPRIPLQNGRLPNHLTHRQHLQRLRSYSAGEASEVSRNRGASL 
IARPGHSLVAAI RSQHQHESbPLAYNDKIVAFLRQPNI FEMLQE 
RQPSLARNHTLREKIHYIRTEGNHGLEKLSCDADLVILLSLFEE 
E1MSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 
YRRDFEAKLRNFYRKLEAKGFGQGPGKIKLI I RRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSREFFFLLSQELFNP 
YYGLFEYSANDTYWQISPMSAFVENHLEWFRFSGRILG\LALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDKEFHQSLQW 
MKDNNITDILDIiTFTVNEEVFGQVTERELKSGGANTQVTEKNKK 
EYIERMVKWRVERGWCQTEALVRGFYEVVDSRLVSVFDARELE 
LVIAGTAEIDLNDWRNNTEYRGGYHDGHLVIRWFWAAVERFNNE 
QRLRLLQFVTGTS S VPYEGFAAPPWE PMGLRR FLP * KKWGKITS 
LPPRG \ HTCLQPDWDLPTVS PRTPMLYEK\LLTA\VEETSTFGT 


5910 


1526 


446 


VAEFAAMEPGRTQI KLDPRYTADLDEVIjKTNYGI PSACFSQPPT " 
AAQLLRALGPVEUUuTSILTLLALGS I AI FLEDAVYLYKNTLCP 
IKRRTLLWKSSAPTWSVLCCFGLWIPRSLVLVEMTITSFYAVC 
FYLLMI,VMVEGFGGKEAVLRTI>RDTPMt4VHTGPCCCCCPCCPRL 
LLTRKKLQ \R+ CWALSNTPS + R* R* PWWACFSS PTASMTQQTFL 
RGACLYGSTLSSA/ CSTLLALWTLG 1 I SRQARLHLGEQNMGAKF 
ALFQVLL I LTALQPS I FS VLANGGQ I ACS PPYSSKTR3QVMNCH 
LLI LET FLMT VLTRM Y YRRKDI IKVG YETFS SPDLDLNLKALRWM 
AWTMKGCCTH 


5911 


109 


595 


Q^PLAPC I QGKGLEMRS PKPQS FURS SHSGAGLLV KNPSTP VF~~ 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GTISAHCNLRLPS SSNS PAPAS * LAG I TGVCHHAQLI FVFLVET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912 




277 


M I LNKALMLGALALTTVMSPCGGEDI VADHVASYG VNLYQS YGP 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
I AVLKHNLN I VIKRSNS TAATNE VPEVTVFS KS P VTLGQPNTLI 
CLVDN I FP PWN ITWLS NGHSVTEGVS ETRPS SPKSDHFI .LQDQ 
VTSPSFPFE* *DL*TAKVEQLGAWFEPLLKHWGAEIPTTL 


5913 


46 


1198 


QLRMAGAEGAAGRQSELEPWSLVDVLEEDEELENEACAVLGGS 
DSEKCS YS QGS VKRQALYACSTCTPEGEE PAG I CLACS YECHGS 
H KLFELYTKRN FRCDCGNSKFKNLE CKLL PDKAKVNSGNKYNDN 
FFGLYCI CKRPYPDPEDEI PDEMIQCWCEDWFHGRHLGAI PPE 
SGDFQEKVCQACWKRCSFLV^YAAQLAVTKIST\GMMDWCGTLM 
E * /DDQEVI KPENGEHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGS S SE SDLQTVFKNESLNAESKSG CKLQELKAKQLI KKDTAT 
YWPLNWRSKLCTC^DCMKMYGDLDVLFXTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLS SMNRVQQVEL I C/G IQ * FED 


5914 


960 


124 


NLGGSELPPEEALFIQVASMNQRRVDFYLAS IEDMLVAI/GGRN 
ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNLLCYDHRTDWEERRPMTTARGWHSMCSIiGDS 
I YS IGGSDDW I ESMERFDVLGVEAYSPQCNQWTRVAPLLHANSE 
SGVAVWEGRI YI LGG YS WENTAFSKTVQV YDREADKWSRGVDLP 
KAI AGGSACFI AP* SLGQRTRKRKAKARGTRTGASDPS CAS WDH 
PHRHLPGLCRPAATS 


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARIIQAPHaiSPRPRTCPPGALQAPEA 
PASRAEG PVAVWNGHTEG PAPARS AP KEPPGL PRPLGS FP CPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVP P IS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PREL PG E E PSAHP VHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQE LPG PAGGEFPEGL * * AAGPAAH 


5916 


256 


633 


S P RM WEI WGP WHRWES FSLEG EW PS R I PE PS PDS TKGTSG KGCR 
TVTGAVHRHLNHVAGI I PWVLHSQLKPTAATAQDQWTSQQYPDH 
PTRLI LQ *NQATADKNN* TTALLQPHQRL \ VS PRMAEA 


5917 


1343 


827 


AHQILTYLEP/ ICLWNYNKILTVFLTKSVLEI *KFIHTPQTYR 








3-*NDFFGlKEVYVSRRLRKTSF/RLAV?FLEQAWSKECVPVDQ " 
FMEHLL PS LLS LAS D PVPNVR VLLAKALRQMLLE KAY FRNAGNP 
HLEVI EET I LALQSDRDQDVS FFAALE PKRRNI I DTAVLEKQN 


5918 


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGRRRMETPFYGDEALSGLGGGASGSGGTFASPGRLFPG 
A?PTAAAGSMMKKDALTLSLSEQVAAALKPAPAPASYPPA\ADG 
APS AAP PDGLLAS PDLGLXKLAS PELERL 1 1 QSNGL VT/TTPTS S 
QFLYPKVAASEEQEFAEGFVKALEDLKKQNQLGAGRAAAAAAAA 
AGG PSGTATGS AP PG ELAPAAAAP EAP V YA \NLS S Y \ AGGCRGL 

rggaatWafaaepvpfppppppgalgprrp/rlalqgrrpqtv 

PDVP\SFGESP\PLSP1ET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALERESEDPS+SPEHGSLASTASLLREQVAQLK 
QKVLSHVNSGCQLLPQHQVPAY 


5919 


1 


4254 


TSVQGDSQGTPTSSQGSINMEHWISQAIHGSTTSTTSSSSTQSG 
GSGAAHRIiADVMAQTHIENHSAPPDVTTYTSEHSIQVERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 
PPSLEAALQRWGTISPKAPCLTTMDTNGKPLYILTYGKLWTRSM 
KVAYSILHK1GTKQEPMVRPGDRVALVFPNNDPAAFMAAFYGCL 
LAEWPVP I E VPLTRKDAGSQQ I G FLLG S CG VTVALTS DACHKG 
LPKS PIGEI PQFKGWPKLLWFVTES KHLS KPPRDWF\ PHIKDAN 
NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQALTQACGYTEAE 
TI VNVLDFKKDVGLWHG I LTSVMNMMHVI S I P YS LMKVNPLS WI 
QKVCQYKAKVAC VKS RDMHWALVAHRDQRDI NLSSLRMLI VADG 
ANPWS ISSCDAFLNVFQS KGLRQEVI CPCASSPEALTVAIRRPT 
DDSNQ PPGRG VXiSMHGLT YG V I R VDS EEKLS VLT VQDVGLVM PG 
AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
FEVFAMTSSGAPISEYPFIRTGLI^FVGPGGLVFWGKMDGLMV 

VSGRRHNADDIVATALAVEPMKFVYRGRIAVFSVTVLHDERIVI 

VAEQR PDSTEEDS FQWMS R VLQAI DS IHQVGVYCLALVPANTLP 
KTPLGGI HI.S ETKQLFIiEG S LHP CNVLMCPHTCVTNLiPKPRQKQ 
PEIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARKFLFLSE 
VLQ WRAQTTPDH I LYTLLNCRGA I ANS LTCVQLHKRAE K I AVML 
MBRGHLQDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP 
CNIATTLPT^KMIVEVSRSACLMTTQLICKLLRSREAAAAVDVR 
TWPLILDTDD * PKKRPAQI CKPCNPDTLAYLDFS VSTTGMLAGV 
KMSHAATSAFCRSI KLQCELYPSREVAI CLDPYCGLGFVLWCLC 
SVYSGHQSIL1PPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CTKGLGSQTESLKARGLDLSRVRTCWVAEERPR IALTQS FSKL 
FKDLGLHPRAVSTSFGCRVNLAICLQGTSGPDPTTVYVDMRALR 
HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 
LGE I WVHSAHNASG YFTI YGDESLQSDHFNSRLS FGDTQTI WAR 
TGYLGFLRRTELTDANGBRHDALYWGALDEAMELRGMRYHPID 
IETSVIRAHKSVTECAVFTWTNLLWWELDGSEQEALDLVPLV 
TNWLEEHYL I VGVWWD IGVI P INSRGEKQRMHLRDGFLADQ 
LDPIYVAYNM 


5920 


1381 


1499 


QLGAVAHAGVSRIPP+LFPPLHPTFLSIiWCLHHKLP/HPPGASM 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRYANIGTQTTGP 
SGVGCCTPGRPLLPCKCSSAAHSTYRVQEPAVHIPGQEPLTASM 
LAAAPLHEQKQMIGERLYPLIHDVHTQLAGKITGMLLE1DNSEL 
LLMLES PESLHAKI DEAVAVLQAHQAMEQPKAYMH 


5921 
5922 


727 

2475 


157 
495 


VL±t> *Ljl>Wt*UU>OLPKb 1 PLKf MUAi?"'l'(JiiyjjKRKFDDVDV 
G3SVSNSDDEIS SSDSADS CDSLNPPTTAS FTPTS ILKRQKQLR 
RKNVRFDQVTVYYFARRQGFTSVPSCGGSSLGMAQRHNSVRSYT 
LCEFAQEQEVNHREI LREHLKEEKXjHAKKMKLTKNGTVESVEAD 
GLTLDDVS0EDI DVENVE VDD YF FLQPL PTKRRRALLRASG VHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKCQVD 
RMSFPCGCSRDGCGNMAGRI EFNPIRVRTHYLHTIMKLELESKR 
Q\GAAQQPQ\ *GALPDCQLQPDRSTGL* DPSWIGSKGLSFTGKG 
AAATHLI ILRVTENRGAEGKRK 

SYSNWGLFPSVFIQVPRSRTGNLKPIFI»FYS YYE ^CMETLKG\T " 








CLYNATQY KVCS PRNDRPDACYN PSE PAATTVFE IRTGLLLGDT"" 
S KI I TRTEE KE I PKQ I TI*RFDACAAI NS K KLE I GCG S LN * ERS * 
RVENKYVCHESGVCKNCAYWPCVI * AT* KKNKNDSVYLQKGEAN 
PSCAAGHCNPLEI»1 1 TNPI/DPHWKKGERVTLG INRTGLKPQWI 
LlltbbVHKCi) PKP V t QTFY EEI^LPAPELLKKTKNLFLQIJVENV 
IFLLNGTSCYVRGGTTIGDRWPWEA*ELVPTDPAPDI IPI *KAE 
ASNF* VLKTSI IRQYCIAREGKDFI I PVGKPNCIGQKLYNSTTK 
TIT** DLNHTE KNPFS KFSKJjKTA* AHAE S H * DWTVP SG L Y * I C 
RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKMGEIJ^FSVYASR 
EKKGI VIGNWKDNEWPRERI IQYYGPATWAQDGS WGYR/ TP / VY 
MLNWI IRLQAILE I ISNETGRALTVLAWQE TQM RNAI YQNRLAL 
DYLLVAEGGVCR KFNLTNCCLQ I NDQGQ WKNI VRDMTKLAHVP 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 

PLLFQMI KG I VATLVHQKTSAHVNYMNHYRS ISQRDSKSEDESE 
NSH 


5923 


137 


638 


QLCGRRGQRFRTSIKRMHPI * RTCPNTNL/ 1 ILLSQENTQIRDL 
QQBNRELWI SLEEHQDALELIMSKYRKQMLQLMVAKKAVDAE PV 
UCAHQSHSAE I ESQIDRI CEMGEVMRKAVQVDDDQFCKIQEKLA 
QLELENKELRELLS ISS ESLQARKENSMDTASQAI K 


5924 


274 


2146 


EKGKVKDAGAEQWI SLSLSCKGSWETQFSNHLNSLTPPTS VRRM ~ 
PLITTVTLLKMVARHHKKLLCSKAFS TQLQQKI FLHSQMGI HHQ 
SVCMKLKPNTSHI ISILMGQPMALVQLETIoAPLTI I IQKFQTQD 
HMKFWKNLPLHSHHLTPSVPQTVI PKKTGSPE IKLKI TKTI QNG 
RELFESSLCGDLLNKVQASE\Q*NQSIESRKEKRKKSNKKDSSR 
S EERKSHK I P KLEPEEQNRPNER VDTVS E KLPREE P VLKEGS PS S 
ANTI FCSNNGSVHW \ FKFQVGDLVWSKVGTYFWWPCM VSSDPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REERIEQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVIj 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKSLPAS ITMHKGSLDLQKCNMS PWKI EQVFALQNATG 
DGKFI DQ FVYSTKG I GNKTE I SVRGQDRLI ISTPNQRNEKPTQS 
VSS PEATSGSTGS VEKKQQRRS I RTRS ESEKSTE WPKKKI KKE 
QVGFLHVES 


5925 


216 


1911 



MMTAESREATGLS PQAAQEKDGI VIVKVEEEDEEDHMWGQDSTlT""" 
QDTPPPDPEIFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQI bELiLVLEQ FLS ILP KELQVWLQE YRPDSGEEAVTLLE 
DLEDDLSGQQVPGQVHGPEMLARGMVPIiDPVQESSSFDLHHEAT 
QSHFKHSSRKPRIJjQSRALPAAHIPAPPHEGSPRDQAMASAIjFT 
ADSQAMVKI 3DMAVSLI LEEWGCQNLARRNLSRDNRQENYGSAF 
PQGGENRNENEESTSKAETSEDSASRGETTGRSQKEFGEKRDQE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK 
GLGRS FS liSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK 
I IHTGEKPYECSECGKAF\SLNS \NLVLHQRI \HTGE KPHECNE 
CGKAFSHS SNLI LHQR IHSGEKPYECNECGKAFSQS S D \ LTKHQ 
RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 
\AFTRSSTLTLHHRI HARERAS E YS PAS LDAFGAFLKS C V 


5926 


2 


233 


DRCLMLKQGSQPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAIIEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146 


1248 


KHFS KFGSQALYQLKR PASGQNS I S VMPAQKI TKPAAKYG I PLA " 
YKKYGDKKI*HEKICPIiOKHKOAHOTPEKRVNTGEERRKmFFaa.T7 
KRRLEFIEKEKKQKDQI I SLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGT IAPSSFSSRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKREIYGRGIiPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMG I LQNLAAN YGGRPS SS RGGKPRNKE E EV 
YLARLRQI RLQNFNERQQ I KAKLRGE KKEANHS EGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEOLERKRKEAYEREKKVWEEHLV 
AKGVKSSD VS P PLGQHETGG SPSKQQMRS VIS VTSALKEVG VDS 
SLTDTRETS EEMQ KTNNA1 SSKR E I LRRLNENLKAQEDEKGKQN 
LSDTFE I NVHEDAKEHEKEKSVSSDRKKWEAGGQLVIPLDELTL 
DTS FSTTERHTVGEV I KLGPNGS PRRAWGKS PTDS VLKI LGEAE 



5928 






4146 



1248 



1558 



113 



6082 



MjiA?'l*KIiIiENTTlRS^ 1 5 PEGE KY K PL I TG E KKV QC I S H E I N PS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEIL.QEPS 
GTWKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSOHSKCDVDKSVQPEPFFHIO/VHSE 
HLNL VPQVQS VQCS PEES FAFRS HSHLP P KNKNKNSIiL IGLSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEraSDNSD 
GEIASECECDSVFNHLEELRLHLEQENKSFEKFFEVYEKIKAIHE 
DED EM I E I CS K I VCNI LGN EHQHL YAKI LHL VMADG A YQEDNDE 
iUii:^KFGSQALYQLKRPASGQNSISVMPAQKI TKPAAKYGIPLA 
YKKYG D KKLH E KK PLQ KHKQ AHQTP E KRVNTG E ERRKI S EEAAR 
KRR LE F IEKE K KQKDQ IIS LMKAEQMKRQE KERLER I NRAR EQG 
WRNVLSAGGSGEVKAPFIjGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNE AKW KRE I YGRGL PE RQKGQliAVERAKQVE E FLQR 
KREAMQNKARAEGHMGILQNLAAWYGGRPSSSRGGKPRNKEEEV 
YliARLR QI R LQ JJFNERQQ I KAKLRG EKKEANHS EGQEGS EEADM 
RRKK\ IESLKAHANARAAVI>KEOLERKRKEAYEREKKVWEEHLV 
AKGVKS S DVS P P LGQHBTGGS PS KQQMRS VI S VTSAL KE VG VDS 
SLTDTRETS EEMQKTNNA I SS KRE I LRR LN ENLKAQEDE KG KQN 
LSDTFE INVHEDAKEHE KEKS VS S DR KKWEAGG QL VI P LDE LTL 
DTS FSTTERHT VGE VI KLG PNGS PRRAWGKS PTDS VL»KI LG E A E 
1«QI*QTEI*LENTTI RSE I S PEG EKYKPL I TGE K K VQCI SHE I N PS 
AIVDS PVETKS PEFSEAS PQMSLKLEGNLEEPDDLETE ILQE PS 
GTNKDE \ SLPCT ITDVW ISEEKETKETQSADR I TI QENEVS EDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWIISE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNPKMLRTCSLPDLSKbFRTLMDVPTVGDVRQDNLEIDEI 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHIjEQEMGFEKFFEVYEKIKAIHE 
D5PENIE ICS KI VQNI LGNEHQHEtYAKI LHLVMADGAYQEDNDE 

bDFSMTTULPAYVAILLFYVSRASCQ DTFTAAVYEHAAILPNAT 
LTPVSREEALALtyiNRNLDILEGAlTSAADQGAHI IVTPEDAIYG 
WNFNRDSLYPYLEDIPDPEVNI^IPCNNRNRFGQTPVQERLSCL\ 
AKNNS I Y WAN I GDKKPCDTSD PQCP PDGR YQYNTD WF\DS QG 
KLVARYHKQNLFMGENQFNVP KEP E I VTFNTTFGS FG I FTCFD I 
LFHDPAVTLVKDFHVDTIVFPTAW^INVLPHI J SAVEFHSAWA^IGM 
RVNFLASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 
QLDSHPSHS A WNWTS YASS I EALSSGNKE FKGTVFFDEFTFVK 
LTG VAGNYTVCQ KDLCCHLS YKMS EN I PNE VYALGAFDGLHTVE 
GRYYLQICTLLKCKTTNLNTCGDSAETASTRFEMFSI^SGTFGTQ 
YVF PEVLLS ENQLAPGE FQVSTDGRLFS LKPTSGP VLTVTLFG R 
IiYEKDWASNASSGLTAQARI IMLI VIAPI VCS LgW 
RGNCFW I V P FTMAQRTGLiEDPER YL FVDRA V I YNPATQ ADWTAK " 
KLVWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ 
KMNPPKFSKVEDMAELTCLKEASVLHNLKDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
. IPGEVLERQLLQANPILESFGNARTVQNDNSSRFGKFIRINFDV 
j TGYIVGANIETYLLEKSRAVRQAKDERTFHI FYQLLSG\AGEHL 
KSDLIiEGFKNYRFLSNGYIPIPGQXQDKGNFRGDPGEAr^HlMG 
FSHEEILSMLKVVSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHIjLGMNVMEFTPJ^ILTPRIKVGRDYVQKAQTKEQADFAVEAIiA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLI ERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSH5KFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQIi YKESLTKLMATIjRNTNPN FVRCI I PNHEKRAGK 
LDPHIiVIJX}LRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 








WAX PKGPMDGKQACERM I RALELDPNLYRIGQSK.I FFRAGVLAH 
LEHKRDLK ITDIII FFQAVCRGYLARKAFAKKQQQLSALKVJjQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEEIiLKVK 
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAKKQELEE I LHDLESRVEEEEERNQ I LQNE KKKMQAH I Q 
DLEEQLDEEEGARQKLQIoEKVTAEAKIKKMEEEIIiLLEDQNSKF 
I KEKKLMEDR I AECS SQIiAEEE E KA KNLAKI RNKQEVM I S DLEE 
RLKKEEKTRQELEKAKRKI*DGETTDLQDQIAELQAQIDEIJCLQL 
AKKEE ELQGALARGDDE TLHKNNAL XWRELQAQ I AELQEDFES 
EKASRNKAEKQKRDLSEELEALKTE1.EDTLDTTAAQQELRTKRE 
CEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLE KNKQG LETDNKELACEVKVIiQQVKAES EHKRKKLDAQV 
QELHAKVSEGDRLRVEIiAEKAS KLQNEIiDN VS TIiLEEAE KKG I K 
FAKDAASIiESQLQDTQEIiLQEETRQKLNLSSRIRQLEEEKNSLQ . 
EQQEEEEEARKNLEKQVLALQSQIADTKKKVDDDLGTIESLEEA 
KKKLLKDAEALSQRLEEKALAYDKLEXTKNRLQQELDDLTVDLD 
HQRQVASNI>EKKQ\KKFDQLLAEEKS I S ARYAEERDRAEAEARE 
KETKALS LARALEEALE AKE EFERQNKQLRADMEDLMS S KDDVG 

NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRE1>EAELEDERKQ 
RALAVAS KKKME I DLKDLE AQI EAANKARDEVI KQLRKLQAQMK 
DYQRELEEARASRDEIFAQS KESEKICLKSLEAE I LQLQEELASS 
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE 
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQ1.E 
QEAKERAAANKXVRRTEKKLKEI FMQVEDERRHADQ YKE QME KA 
NARMKQLKRQLEEAEEEATRANASRRKIiQRELDDATEANEGIiSR 
EVSTLKNRI*RRGGPISFSSSRSGRRQLHI*EGASL,EIiSDDDTESK 
TSDVNETQPPQSE 


5931 


113 


6082 


RGNCF W I VPFTMAQRTGLED PER YL F VDRAVI YN PATQADWTAK 
KLVWI PS ERHGFEAAS I KEERGDEWfVELAENGKKAMVNKDD I Q 
KMNPPKFSKVEDMAELTCLWEASVLKNLKDRYYSGLIYTYSGLF 
CVVINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I L CTGESGAGKTENTKKVI Q YLAHVAS SHKGRKDHN 
IPGE\LERQLLQANP I I>ES FGNARTV QNDNS SRFGKFI RINFDV 
TGYIVGAN I ETYLLEKSRAVRQAKDERTFH I FYQLI>SG\AGEHL 
KSDLtJbEGFNN YRFI»SNGYI P I PGCAqDKGNFRGDPGEAMHIMG 
FSHEEXLSMLKVVSSVLQFGNISFKKERtJTDQASMPENTVAQKL 
CHLLGMNVMEFTRAI L.TP R I KVG RD YVQ KAQTKEQAD FAVEALA 
KATYERLFRWLVHR I NKALDRTKRQGAS FIG ILD I AGFEIFELN 
SFEQEjCIITYTNEKIiQQIjFNHTMFILEQEEYQREGIEWNFIDFGI* 
DLQPCIDL 1 ER PANP PGVLALLDEECWFPKATDKTFVEKLVQEQ 

GSHSKFQKPRQLKDKADFCI IHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 

KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI I PNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRI CRQGFPNR I VFQEFRQRYEILTP 
NAIPKGFMDGKQACERMIRALBLDPNLYRIGQSKIFFRAGVLAH 
LEEERDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYI,KLRHWQWWRVFTKVKPI.IiQVTRQEEELQAKDEEIiLKVK 
EKQTKVEGELEEMERKIIQQLLEEKN I LAEQLQAETELFAEAEEM 
RARLAAKKQELEEILHDLESRVEEEEERNOILQNEKKKMQAHIQ 
DLEEQiiDEEEGARQKLQIiEKVTAEAKI KKMEEE I LLLBDQNSKF 
IKEKKLMEDRIAECSSQIiAEEEEKAKNLAKIRNKQBVMISDLEE 
RLKKEEXTRQEL»EKAKRKLDGETTDLQDQIAELQAQIDELKLQL 
AKKEEELQGAIJ\RGDDETIiHKlINALKVVRELQAQI AELQEDFES 
EKAS RNKAEKQ KRDLSEELEALKTELEDTIjDTTAAQQELRTKRE 
QE VAELKKALE EETKNHEAQ IQDMRQRHATALE ELSEQLEQAKR 
FKANLEKNKG^LETDNKELACEVKVLQQVKAESEHKRKKIiDAQV 
QELHAKVSEGDRI^VELAEKASKLQNEIjDNVSTIiLEEAEKKGIK 
FAKDAASIfESQLQDTQELIiQEETRQKliNIiSSR I RQLEEEKNSLQ 
EO^BEEEEARKNIiEKQVIJ^SQIJuTTKKKVDDDLGTIESLEEA 
KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 
KBTKALS LARALEEALEAKEEFERQNKQLRADMEDLMSS KDDVG 
KNVHELEKSKRALEQQV\EEMRTQDEELEDELQATEDAKLRLEV 
NMQAMKAOFERDLQTRDEQNEEKKRLLIKQVREI.EAELEDERKQ 
RAIiAVASKKKME IDLKDLEAQI EAANKARDEVIKQLRKLQAQMK 
DYQREJ^EARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQERDELADE 1 TNS AS G KS ALLDEKRRLEARI AQLEE 
ELEEEQSNMELLNDRFRKTTLQVDTI^NAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQIiEEQLE 
OEAKERAAANKIiVRRTEKKLKEI FMQVEDERRHADQYKEQMEKA 
KARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


572 


RHLEEICFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FGATLAVGLT I FVLSWTI I ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYPPPYPAQPMGPPAYHETLAGGAAAPYPASOPPYNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPMQNLSRPLLENKLK 
AFS IGKMSTAKRTLS KKEQEELKKKEDEKAAAE I YEE FLAAFEG 
SDGNKVKTFVRGGWNAAKEEKETDEKRGK1YKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRI^SRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGD PSTT \NF YLGN I \NPQMNLKKCCCQE FGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERAJLKNJUNGKMI 
MSFEMKIiGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMI>PPPKNKEDFEKTLSQAIVKWIPTERNLLALI 
HRMI EFVVREGPMFEAMI MNRE INNPMFRFLFENQrPAHVYYRW 
FCLYSILQGDSPTKWRTBDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAF VEE P S KKGALKEEQRDKLEE 1 1* RGLTPR KNDIGDAM VFC 
LNNAEAAEEI VDCITES LS I LKTPLPKKIARL YLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAI YPEPFL UCLQNIFLGLVNI IEEKETED VPDDLD 
GAP I E EELDGAPIiEDVDG I P IDAT P I DDLDGVP I KSLDDDLDGV 
PLDATEDS KKNE P I FKVAPS KWEAVDE S ELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHKLYSNPIKEEMTE 
S KFS KYS EMSEEKRAKLRE I ELKVMKFQDELESGKRPKKPGQS F 
QEQVBHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKS RSQSRS PHRSHKKSKGKTNTGRKFFKKAVT YWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGS QKAS S KTRSS DVHS SGSSDAHMDASG PS D 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPIJuENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKBDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSUjVIETKKPPLKKGEKEKKKSNL»ELFKEEI*KQI 
QEERDERHKTKGRLS RFE P PQSDSDGQRR S MDAPSRRNRSSG VL 
uuiMfLj&riiJVLaUFo 1 1 \Nr YJjGNI \NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MS FEM KLGWGKAVP I P PHP I Y I P PSMMEHTLPP PPSGI*P FNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAI VKWI PTERNLLALI 
HRMIEFVVREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNOIGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVANAS YYRKFFETKLCQ I FS DLNAT YRTI QGHLQSENFKQRVM 
TCFRAWEDWAI YPEPFLIKLQNI FLGLVNI IBEKETEDVPDDLD 
GAP I EEELDGAPXiEDVDG X P IDATPI DDLJ3GVPI KSMDDLDGV 
PLDATEDSKKNEP I FKVAPSKWBAVDESELEAQAVTTS KWELFD 
QHE ES EEEENQNQEE ESEDEEDTQS SKSEEHHLYSNP I KEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKIiIjQRE KE KELERERERDKKDKEKLE S RS KDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 



SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCE IGASAL 
SDSG S FVS SRARREKKS KKGRQEALER LKKAKAGERYK YEVEDF 
TGVYEEVDEEQ YSKLVQARQDDDW I VDDDGIGYVEDGRE I FDDD 
LEDDAI^ADEKGKIXJKARNKDKRl^KKIiAVTKPNNIKSMFIACA 
GKKTADKAVDliS KDGLLGDILQDLNTETPQITPPPVWI LKKKRS 
IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPI*KRAEFAG 
DDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWD 
KESEPAEEVKQSADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQE VQVDS SHL PLVKGADEEQVFHF YWLDAYEDQ YNQ PGWFLF 
GKVWI ESAETH VS CCVMVKNIERTLYFIiPREMKIDLNTGKETGT 
P I SMKDVYEE FDEKIATKYKIMKFKS KPVEKNYAFEI PDVPEKS 
E YIiEVKYSAEMPQLP QDbKGETFSHVFGTNTS SIjELF1»MNRKI K 
GPCWLEVKKSTAIiNQPVSWCKVEAMALKPDLVWVIKEVSPPPLV 
VMAFS MKTMQN AKNHQNE 1 1 AMAALVHHSFALDKAAPKPPFQSH 
FCWSKPKDC I FP YAFKEVI EKKNVKVEVAATERTLLG FFLAKV 
HKIDPDI IVGHN I YGFELEVLLQRINVCKAPHWSKIGRLKR5NM 
P KLGGRSGFGERNATCGRM I CD VE I SAKELIRCKS YHLSELVQQ 
ILKTERWIPMENIQNMYSESSQLLYLIjEHTWKDA\KFII*QIMC 
ELNVLPLALQ I TNI AGNIMSRTLMGGRSERNBFLLLHAFYENNY 
IVPDKQIFRKPQQKLGDEDEEIEKSDTNKYKKGRKKGAYAGGLVL. 
DPKVGFYDKFI IjIiliDFNSlrYPSI IQEFNICFTTVQRVASEAQKV 
TEDGEQEQIPELPD PS IiEMG I LPREI RKLVERRKQ VKQLMKQQD 
LNPDLI LQYD IRQKALKLTANSMYGCLGFS YSRFYAKPLAALVT 
YKGRE IIjMHTKEMVQKMNI>EVI YGDTDS IMINTNSTNLEEVFKL 
GNKVKSEVNKLYKLLE IDIDGVFKSLLXjLKKKKYAALWEPTSD 
GNYVTKQELKGIjDIVRRDWCDIjAKDTGNFVIGQILSDQSRDTIV 
ENIQKRIiIEIGENVLNGSVPVSQFEIHKALTKDPQDYPDKKSLP 
HVHVALWINSQGGRKVKAGDTVSYVICQDGSNLTASQRAYAPEQ 
IiQKQDNliTIDTQYYIiAQQIHPWARI CEPIDGI DAVLIATGWEI* 
\DPTQFKVHHYHKDEENDALI^GPAQLTDEEKYRDCERFKCPCP 
TCGTEN I YDNVFDGSGTDME PSIiYRCSWIDCKAS PLTFTVQIjSN 
KLIMD I RRFI KKY YDGWI*I CEEPTCRNRTRHLPLQFSRTGP I»CP 
ACMKATLQPEYS DKS LYTQLCF YRY I FDAECALEKLTTDHEKDK 
LKKQF FT P KVLQD YRKLKNTAEQFLSRSG YSEVNLS KliFAGCAV 
K3 


5936 


1124 


139 


RGEEQFDAEFRRF ACLGFGERLQEFSRLLRAVHRS RAWTCY LAI 
RMLMATCCPSPTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSLQRRKKGLLIiRP VAPLRTR PPLLI S L PQDFRQVS SVIDVDLL 
PETHRRVRLHKHGSDRPLG FY IRDGMS VRVAPQG \ LER VPG I FI 
SRLVRGGLAESTGLLAVSDEI LEVNGI EVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPPSAGPG PAE PDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDIiHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


** PTS LLKSTVQLMCRliLQDKRY QCV YSLAE I FKVLAS F Y V I LV I L 
YGLTSSYSLVWMLRSSLKQYSFEALREKSNYSDIPDVKNDFAFI 
LHLADQYDPI»YSKRFS I FLSEVSENKLKQINLNNEWTVEKLKS K 
LVKNAQDKI ELHIjFMIiNGLPDNVFELTEMEVIiS LEL I PEVKLPS 
AVSQLWLKEL.RVYHSS LWDHPALAFLEENIiKILRLKFTEMGK 
IPRWVFHLKNLKELYIiSGCVIiPEQLSTMQLEGFQDLKNLRTLYL 
KSSI»SRIPQVVTDLIjPSLQKLSLDNEGSKLVViiNNLKKMVNLKS 
LELiI SCDLER I PHS I FSLNN t»HELDLRENNI»KT VEE IIS FQHLQ 
NLSCLKLWHNNIAYIPAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKI*HYLDI>SYNHIjTFIPEEIQYI*\SNI*QYFAVTNNNI emlpdg 
LFQCKKLQCLLLGKNSLMNXiSPHVGELSNLTHREPIG 
P P ELEGCQS LKRNCL I VE ENLLNTL PL PVTERLQTCLD KC I 


5938 


395 


1865 


YKGEGFFCNQEARGERRKKKKAMSSPN I WSTGSSVYSTPVFSQK 
vrviijjjAjjjoijxtAjr l^yj^UUI^xEDYASNKTWVLTPKVPEGDV 
TVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDI FFAQTW YDRRLKFNSTIKVLRLNSKMVGKIWI PDTFFRN 
SKKADAHWlTTPN^JLRIWNDGRVLYSIiRLTIDAECQI,QLHNFP 
MDEHSCPLEFSSYGYPREEIVYQWKRSSVEVGDrRSWRLYQFSF 
VGLRNTTE WKTTSGDYWMSVY FDLSRRMGYFTIQT Y I PCTLI 
WLS W VS FWI N KDAVPARTS LG I TTVLTMTTLST I ARKS L P KVS 
YVTAMDLFVS VCFI FVFSALVE YG \TLTI YFVSNRKPS KDKDKKK 
KNPAP TID I RPR SAT I QMNNATHLQERD EE YG YECLDGKDCAS F 
FCCFEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYW 
VSYLYL 


5939 


66 


1404 


IRPGYLKEVQENSPGH RAGLEP FFDFI VS INGSRLNKDNDTLKD 
LLKANVEKPVKMLI YS S KTLELRETS VTPSNLWGGQGLLGVS I R 
FCS FDGANENVWHVLEVESNS PAALAGLRPHSDYI IGADTVMNE 
S BDLi FS LIE THE AKPLKL YV YNTDTDNCRE V 1 1 TPNSAWGGEGS 
LGCG IG YGYLIIRI PTR PFEEGKKISLPGQMAGTPI TPLKDGFTE 
VQLS SVNPPS LS PPGTTGI EQSLTGLS ISS TP \ PAVSSVLSTGV 
PTVP \ LLP PQVNQSLTS VP PMES S YLHLPG LMP FTRQGL PNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAAS S LT VDVTP PTAKAPTT VEDRVGDS TP VS EKP VS AA 
VD ANAS ESP 


5940 


145 


717 


RRSASRSASPRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLPVHMGLVI TEVEQE PS FSDI ASLWWCMAVG IS YI SVYDH 
QGI FKRNNSRLMDEILKOQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLSPEDGKADIVRAAQDFCQLVAOKQKRPTDLDVDT 
LA\ VYLVQMWLI LI 


5941 


13 


6147 


MCLGRMGASS PRS PEPVGPPAPGLP FCCGGSLLAVWLLALPVA 
WGQCNAPE W\ L P FARPTNLTD EFEFP IGT YLN YECRPG YS GR P F 
S I ICLKNS VWTGAKDRCRRKS CRNPPDP VNGMVHV I KG I Q FGSQ 
IKYSCTKGYRLIGSSSATCI ISGDTVI WDNETP I CDRI PCGLPP 
TITNGDFI STNREWFH YGS VVTYRCNPGSGGRKVFELVGEPS I Y 
CTSNDDQVGI WSGPAPQCI I PNKCTP PNVENGI LVS DNRSLFS L 
NEWE FRCQPGFVMKGPRRVKCQALNKWE PELPS CSR VCQPPPD 
VLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAASMRCTPQGDW 
S PAAPTCE VKS CDD FMGQLLNGR VLFP VNLQLGAKVD FVCDEG F 
QLKGSS AS YCVLAGMESLWNSS VPVCEQ I FCPSPPVI PNGRHTG 
KPLE VFP FGKAVNYTCDPHPDRGTS FDL I GEST IRCTSDPQGNG 
VWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASDFPIGTSLKYE 
CRPEYYGRPFS ITCLDNLVWS S PKDVCKRKSCKTPPDPVNGMVH 
VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 
CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 
FELVGEPS I YCTSNDDQVGI WSGPAPQCI IPNKCTPPNVENGIL 
VSDNRS LFSLNE WE FRCQPG FVMKGPRRVKCQALNKWEPELPS 
CSR VCQ P P PDVLHAE RTQRDKDNFSPGQE VFYSCE PG YDLRGAA 
SMRCTPQGDWS PAAPTCEVKS CDDFMGQLLNGRVLFP VNLQLGA 
KVDFVCDEGFQLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSP 
PVIPNGRHTGKFLEVFPFGKAVNYTCDPHPDRGTSFDHGESTI 
RCTSDPQGNGVWSS PAPRCG I LGHCQAPDHFLFAKLKTQTNASD 
FP IGTSLKYECR PE YYG RPFS I TCLDNLVWSS PKDVCKRKSCKT 
PPD P VNGMVHVITD I QVGSR INYS CTTGHRLIGHSS AE C I LSGN 
TAHWSTKPPICQRIPCGLPPTIANGDFISTNR3NFHYGSVVTYR 
CNLGSRGRKVFELVGEPSI YCTSNDDQVG IWSGPAPQCI IPNKC 
TP PNVENG ILVSDNRS LFSLNE WEFRCQ PG FVMKGPRRVKCQA 
LNKWEPBLPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEVFYSC 
EPGYDLRGAASLHCTPQGDWS PEAPRCAVKS CDDFLGQLPHGRV 
L FPLNLQLGAKVS FVCDEG FRLKGS SVSHCV LVGMRSLWNNS VP 
VCEHIFCPNPPAI LNGRHTGTPSGDI PYGKE IS YTCDPHPDRGM 
TFNLIGESTIRCTSDPHGNGVWSSPAPRCELSVRAGHCKTPEQF 
PFASPTI P INDFEFPVGTSLNYECRPGYFGKMFS ISCLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRI, 
IGSPSTTCLVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSKN 
RTS FHNGTVVTYQCHTGPDGEGLFELVGERS I YCTSKDDQVGVW 
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGS HTVQCQTNGRWGP KLPHCSRVCQP P PE I LHGEHTLSHQ 
DNFS PGQE VF YSCEPS YDLRGAAS LHCTPQGDWS PEAPRCTVKS 
C33DFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VS YTCD PHPDRGMTFNL IG EST I RRTS EPHGNGVWSS PAPRCEL 
PVGAACPHP PKIQNGHYI GGKVSLYLPGMTIS YTCDFGYLLVGK 
GF I FCTDQG IWSQLDHYCKEVNCSFPLFMNG I SKELEMKKVYHY 
GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDAIjI 
VGTLSGTIFFI LLI I FLSWI I LKHRKGNNAHENPKEVAI HLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


688 


YLYVRMRANPLAYGISHKAYQIDPPL\RKHREQ\LVIE\VGRKL 
DK\AQM I RFEERTG YFSSTDLGRTASHYYIKYNTI ETFNELFDA 
HKTEGDIFAIVSKAEEFDQIFCVREEEIEEI>DTI»LSNFCEI»STPG 
GVENSYGKINILLQTYINRGEMDSFSLISDSAYVAQNAARIVRA 
LFE I ALRKRWPTMT YRLLNLS KAIDKRLWGWAS PLRQFS I LPPH 
MLTRLEEKKLTVDKIjKDMRKDEIGHILHHVNIGLKVKQCVHQIP 

svmmeafiqpitrtvlrvtlsiyadftwndqvhgtvgepwwiwv 

EDPTNDHI YHSE YFLALKKQVI SKEAQLLVFTI P I FEPLPSQYY 
IRAVSDRWLGAEAVCI 33* FQHT,ILPERHP PHTELLDLQPLP ITA 
LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSGXT 
VAAELAI FRVFNKYPTSKAVYIAPLKALVRERMDDWKVRI EEKXj 
GKKVIELTGDVTPDMKSIAKADLIVTTPEKWEGVSRSWQNRNYV 
QQVT 1 III IDE IHLLGEERGP VLEVI VSRTNFIS SHTEKPVR I VG 
LSTALANARDLADWLNI KQMGL FNFRPS VRPVPLEVHIQGFPGQ 
HYCPRMASMNKPAFQAIRSHSPAKPVLIFVSSRRQTRLTALELI 
AFLATEEDPKQWLNMD BREMEN I IATVRDSNL.KLTLAFGIGMHH 
AGLHERDRKTVEELFVNCKVQV:,! ATST1AWGVNFPAHLVI I KG 
TEYYDGKTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 
KKDFYKKFI.YEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD 
YITOTYFFRRLIMNPSYYNLGDVSHDSVNKFLSHIilEKSLIEIiE 
LS YCIE IGEDNRS I E PLTYGR I AS YY YLKHQTVKM FKDRLKPEC 
STEELLS IIjSDAEEYTDIiPVRHNEDHMNSEIiAKCLP I ESNPHS F 
DSPHTKAHLLLQAHLSRAMLPCPDYDTDTKTVIiDQALRVCQAMIi 
DVAANQGWIiVT\n^ITNLIQWIQGRWIiKDSSLLTLPNI ENHHL 
HLFKKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVGISVKGSMDDLVEGHNELSVST 
LTADKRDDNKWIKLBADQEYVLQVSLQRVHFGFHKGKPESCAVT 
PRFPKSKDEGWFLILGEVDKRELIALKRVGYIRNHHVASbSFYT 
PEIPGRYIYTLYFMSDCYLGLDQQYD/NLSQRYTSESFCTGQHQ 
GIi 


5943 


1 


2274 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQTWLPNHVVFLRLR 
EGLKNQS PTEAEKPASSSLPSS PPPQLLTRNWFGLGGELFLWD 
GEDSSFLWRLRG PSGGG\ EEPALSQYQRLLCINPP LFE I YQVL 
LS PTQHHVALI GI KGLWVLELPKRWG KNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAW YPSE I LDPHWLLTSDNVIR I YSLR 
EPQTPTNVIILSEAEEESLVLNKGRAYTASLGETAVAFDFGPLA 
AVPKTLFGQNGKDEVVAYPLYI LYENGETFLTY I S L LHS PGN/ 1 
WKAVGS IAHAS \AAEDNYG YDACAVLCLPCVPNI LVIATESGML 

yhcvvltceeeddhtsekswdsridlipslyvfecvelelalkl 
asgeddpfdsdfscpvklhrdpkcpsryhctheagvhsvgltwi 

HKLHKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLPCRQPAP 
IRGFWI VPDILGPTMICI TSTYECLI WPLLSTVHPAS PPLLCTR 
EDVEVAES PLRVLAETPDS FEKHIRS I LQRSVANPAFLKAS EKD 
I AP P PEECLQLLSRATOVFREQYI LKQDLAKEE I QRRVKLLCDQ 
KKKQLEDLS YCREERKSLREMAERLADKYEEAKEKQED I MNRMK 
KLLHSFHSELPVLS DSERDMKKELQL I PDQLRHLGNAI KQ VTMK 
KDYQQQKMEKVLSLPKPTI 1 1»S AYQRKCIQS I I*KEEGEH I REMV 
KQ IND I RNHVNF 


5944 


167 


3428 


FS I ATFTDEPEVLTEPPS ATTTTT I G I S AT WTTLAGSHGKRNNT 
ITTTS S KRKNRKNKI T P ENVQ 1 1 FDDPLPI S YS QPE KVNG ES KS 
SSTSESGDSDNMRISSCSDES SNSNSSRKSDNHS PAWTTTVS S 
KKQ PS VLVTF P KE ERKS VSGKAS I KLS ETISEGTSNSLS TCTKS 
GPS PLS S PNGKLT VAS PKRGQKRE EG W KEWRR S KKVSVPSTVI 
SR V IGRGGCN I NAI R EFTG AH I DI DKQKDKTGDR 1 1 T I RGGTE3 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTSLMGIKMTTVALSSTS QTATALTVPAI S S AS THKT I KNP 
VN\NVRPGFPVSFP\ljAYPPPQFAHAIiIiAAQTFQQIRPPRLPMT 
HFGGTF P PAQS TWGP FP VRPLS PARATNS PKPHMVPRHSNQNSS 
GSQVNSAGSLTSSPTTTTSSSASTVPGTSTOGSPSSPSVRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMT VP PLATS SAPVAVPS TAP VTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLS TQS ACQNS VHPANK P I APNFSAPL PFG P FSTLFENS PT 
SAHAFWGGSVVSSQSTPESMIjSGKSSYIjPNSDPIaHQSDTSKAPG 
FRP PLQRP A PS PSG I VNMDS P YGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSS 
GVRAPS PAPSSVPLGSEKPSNVSQDRKVP VPIGTERSARI RQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSL1K 
MVS SSTENNGPQTVWTGPWAPHMNS VHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRKGIAEDLKGQADFFFLLVSELAVVATGSPRA 
WLTCLI LPLPGI I FS VLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRI LRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVS YLI/3 RGAAWVG VCS LSGRDAAQIiAEEAG F P E VARMVRE S H 
GETRSPENRS PTPS LQ YCENCDTH FQDSNHRTSTAKLI»S LSQG P 
QPPNLPLGVP I SSPG FKLLLRGGWEPSMGLGP RGEGRAWP I PTV 
LKRDQEGLGYRS APQ PRVTHF PAWDTRAVAGRE \ TP PRVATLS W 
REERRREE \KDRAWERDLRTYMNLEF 


5946 


541 


1666 


ilgsyssiqpeeysvsvvcVevvlqdliaVyvspkXhsylrdlp 
segspqrvnsidfvXblVekxqpdvlvhavlrwdf/tilteav 
ysyrgqkqkkvmltveqaqdqhyalvlwgpgaaw\ypqlqrkkg 

YI WEFKYLFVQCNYTIjENLELHTTP WSSCE CXiFDDD I RAI TFKA 

kfqksapsfvkisdiathledkcsgvvlikaqiselafpitasq 
kialnahsslks ifsslpni vytgcakcx3lei.etdenri ykqcf 
sclpftmkkiyyrpalmtaidgrhdvcirveskliekillnisa 

DCLNRVIVPSSEITYGMVVADLFHSLLAVSAEPCVIiKIQSLFVL- 
DENSYPLQQDFSLLDFYPDIVKHGANARIi 


5947 


3 


1317 


RG I PDRRRRG P IGRVNMDItENKVKKMGLGHEQGFGAPCLKCKEK 
CEG FELHFWRKI CRNC \NVAKKSM / TVLLSNE E DRKVGKLF3DT 
KYTTLIAKLKSDG1 PMYKRNVMILTNP VAAKKNVS INTVTYEWA 
PPVQNQAIiARQYMC3MIiPKEKOPVAG<;Pf;anvDirK'nT avm D7VLm 
QDPSKCHEl^PREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
DPAIYAERAGYDKLWHPACFVCSTCHELLVDMIYF^KNEKLYCG 
RHYCDS EKPRCAGCDEIj I FSNBYTQ AENQNWHLKHFCCFDCDS I 
LAGE I YVMVNDKP VCKP CYVKNHAVVCQGCHNAI DP EVQRVT YN 
NFSWHASTECFLCSCCSKCIiIGQKFMPVEGMVFCSVECKKRr>lS 


5948 


39 


3370 


YRER YP VSGGS VLRSALE VCWDFLSGLTEGSLLPEGFFS GP I DQ 
GNHYQMRRXGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEIEIEGRLHRIS IFDPLEI ILEDDLTAQEMSECNSNKENSERP 
PVCLRTKRHKNITOVKKICNEALPSAHGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 
KRKGDC VP AVS QSMFE F LMDRFEKE SHCENQKQGEQQS L IDED A 
VCCI CMDGECQNSNVILFCDMCNLAVHQECYG VPYI PEGQWIjC/ 
RAHCLQSRARPADCVliCPNKGGAFKKTDDDRWGHV\VCALW\ I P 
E \ VG FANTVFI EP I DG VRNI P PARWKLT\ CNLCKE KGR / VGAC I 
QCHKANC YTAFHVTCAQKAGLYM KME PVKELTGGGTTFS VRKTA 
YCDVHTP PGCTRRPLNI YGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAE PCAVLPT VCAPYI PPQRLNRIANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLI ELURKREKLKREQVKVEQVA 
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRS KRAKLLKKE I ALLRNKIiSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGE E VI. PRLETLLQPRKRSR S TCGDSE VEEES PG KRLDAGL 
TNG FGGARS EQE PGGGLGR KATPRRRCASE S S I S S SNS P LCDSS 
FNA PKCGRG KPALVRRHTLEDRS EL I S CI ENGN YAKAAR I AAEV 
GQSSMWI 5TDAAAS VLEPLKVVWAKCSGYPS YPALI IDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5949 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGRCHRGS AARHPSS PCS VKHS PTRETLTYAQAQRM 
VEIEIEGRLHR IS I FDPLE 1 1 LEDDLTAQEMSECNSNKENS ERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 
KRKGDC V PAVSQSM FEFLMDRFEKESHCENQKQGEQQSL I DEDA 
VCCI CMDGECQNSNVI LFCDMCNLA VHQE CY G VP Y I PEGQWLC/ 
RAHCLQS RARP AD CVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E \ VG FANTVFIE P I DGVRN I PPARWKLT \ CNLCKEKGR/ VGACI 
QCHKANCrrAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTPPGCTRRPLNI YGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVL PTVCAP Y I P PQRLNRI ANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLIjRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLL 1 ELLRKRE KLKREQVKVEQVA 
MELRLTPLTVLLRS VLDQLQDKDPAR 1 FAQPVSLKEVPDYLDHI 
KHPMDFATMRKRLBAQGYKNLHE FEED FDLI IDNCMKYNARDTV 
FYRAAVRLRDQGG WLRQARREVDS IGLEEASGMHLPERPAAAP 
RRP FSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGIiEGFEEDGAAL 

gpeageevlprletllqprkrsrstcgds evbees pgkrldagl 
tngfggarseqepggglgrkatprrrcasessisssnsplcdss 
fnapkcgrgkpalvrrhtledrselisciengmyakaariaaev 
gqssmwistdaaasvleplkvw/akcsgypsypaliidpkmprv 
pghhngvtipappldvlkigehmqtksdeklflvlffdnkrswq 
wlpkskmvplgidetidklkmmegrnssirkavriafdramnhl 
srvhgeptsdlsdid 


5950 


1166 


373 


esrsltmstsqpgacpcqgaasrpailyallssslkavprprsr 
clcrqhrpvqlc^phrtcreaijdvlaktvaflrnlpsfwqlppq 
dqrrllqgcvjgplfllglaqdavtfevaeapvpsilkkilleep 
sssggsgqlpdrpqpslaavqwlqccles fwslelspke \ YACL 
KG pi lfnpdvpglqaashighlqqeahwvlcevlepwcpaaqgr 
ltrvlltastlks I ptsllgdlffrpi igdvdiagllgdmlllr 


5951 


143 


5449 


wnvkpsllwqlfkfsdkeeheqnds I sgktgetgveemiatrk 
veqdsketvklsheddhi ledagssdi ssdaactnpnktenslv 
glpscvdevtecnlelkdtmgiadktentlernkieplgyceda 
esnrqlestefnksnlewdtstfgpesnilenaicdvpdqnsk 
qlnaiestkieshetanlqddrnsosssvsylesksvkskhtkp 
vihskqnmttdapkktvaakyevihsktkvnvksvkrntdvpes 
q^nfhrpvkvrkkqidkepkiqscnsgvksvknqahsvlkktlq 


5952 






UQTLVQIFKPL,THSLSUKSHAHPGCLKEPHHPAQTGHVSHSSQK 
QCHKPQQQAPAMKTNSHVKBELEHPGVEHFKEEDKLKLKKPEKN 
LQPRQRRSSKSFSLDBPPLFIPDNIATIRREGSDHSSSFESKYM 
WTP3 KQ CG FCKKPHGNRFM VG CGRCDDW FHGDCVGLS LSQAQQM 
GEEDKEYVCVKCCAEEDKKTEILDPDTLENQATVEFHSGDKTME 
CE KLGLS KHTTNDRT KYI DDT VKHKVKI LKR ESG EGRNS SDCRD 
NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKEST^VTCTGEK 
AS K PGTHE KQEMKKKKV \ EKG VLNVHPAAS AS KPS ADQ I RQS VR 
HS LKDILMKRLTDSNIiKVPEEKAAKVATKI EKELFS FFRDTDAK 
YKNK YRS LM FNLKD PKNNI LFK KVLKGEVTPDHLI RMS P EELAS 
KELAAWRRRENRHTIEMIEKEQREVERRPITXITHKGEIEIESD 
APMKEQEAAMEIQEPAANKSLEKPEGSEK\RKEEVDSMSKDTTS 
QHRQHLFDLNCKI CIGRMAP P VDDLS P KKVKWVG VAR KHSDNE 
AESIADALSSTSNILASEFFEEEKQESPKSTPSPAPRPEMPGTV 
EVES T FLARLN FI WKGFI NMPS VAKFVTKAY? VSGS PE YLTEDL 
PDS I Q VGGR I S PQTVWD YVE K I KAS GT KE I CWR FTP VTEEDQ I 
S YTX.LFAYFS SRKR YG VAANNMKQ VKDMYIi I PLGATDKI *>HPLV 
PFDG PGLELHRPNLLLGL I IRQKLKRQHSACASTSH I AETPESA 
PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ 
NLQEDLPTAVEPLMEVTKQEPPKPI.RFLPGVLIGWENQPTTLEL 
ANKPLPVDDILQSLI^TTGQVYDQ\AQSVMEQNTVKBIPFLNEQ 

tnskiektdnvevtdgenkeikvkvdnisestdksaeietswg 
sss isagsltslslrgkppdvsteafltnls iqskqeetveske 
ktlkrqlqedqemnlqdmqtsnsspcrsnvgkgnidgnvscsen 
lvantarspqfinlkrdprqaagrsqpvttseskdgdscrngek 

HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 
PSVENIQTSQAEQAKPLQEDILMQNIETVHPFRRGSAVATSHFE 
VGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 
S PPGFP FPG PPNFPPQ SMFGFPPHLP PPLLPPPGFG \ FA\ QNPM 
VPWP PW\HIiP\ GQPQRMMGPI/SQASRYIGPQNF YQVKDIRRPE 
RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSBDYEKDKEREKS KHREGEKDRDR YH 
KDRDHTDRTKSKR 


5953 


3226 


639 


PPARRSARDI»PRALSMEAARPSGSWNGALCRLL\IjVTI»\aFLI F 
ASDACKNVTLHVPSKLDAEKLVGRVNLKECFTAANLIHSSDPDF 
QILEDGSVYTTNTII^SEKRSFTILLSNTENQEKKKIFVFLEH 
OTKVLKKRHTKEKVIJmAKRRWAPIPCSMLENSLGPFPLFLQQV 
QSDTAQNYT I Y YS I RGPGVDQEPRNLFYVERDTGNL YCTRP VDR 
EQYES FE I IAFATTPDG YTPELPLPL I 1 KI EDENDNYP I FTEET 
YTFTIFENCRVGTTVGQVCATDKDEPDTMHTRLKYSIIGQVPPS 
PTLFSMHPTTGVI TTTSSQLDRELIDKYQLKI KVQDMDGQ YFGL 
QTTSTCI INIDDVNDHLPTFTRTSYVTSVEENTVDVEILRVTVE 
DKDLVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCVVKPL 
NYEEKQQMIIiOIGVVNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSS3IRYKKL 
TDPTGWVTIDENTGS IKVFRSLDREAETI KNGI YN I TVLASDQG 
GRTCTGTLGI ILQDVNDNS PFI PKKTVI ICKPTMSSAEIVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQKDPPF 
GSYWPITVRDRLGMSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAIIiAILLGIALFFCILFTLVCGASGTSKQPKVIPDD 
IiAQQNLIVSNTEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
1 KNGGQETI EMVKGGHQTSESCRGAGHHHTLDS CRGGHTEVDNC 
r ytysewhs ftqprlgees IRGHTLIKN 


5954 


330 


811 


P LLCNPDPG W Y W WVKQESE I S KESQEM DARPKLDLGFKEGQT I K 
LCIGNI TNKKGGASKPRTARGGGLS LLP PPPGGKVTI PPPSS / V 
KLPSTNHVTPPS I PKSNHGGSDADILLDLDS PAP VTTPAPTP VS 
VSNDLWGDFSTASS S VPNQAPQPSNWVQF 






2130 



PPPPPPKLANMADLEAVLADVSYIJV1AMEKSKATPAARASKRIVL 
PEPSIRSVMQKYIAERNEITFDKIFWQKIGFIiFKDFCLNEINE 
WPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 
PFSKQAVEHVQSHLSKKQVTSTLFQPYI eeiceslrgdi fqkfm 
ESDKFTRFCQWKNVELNIHLTMNEFSVHRIIGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKR I KMKQGETLALNERI MLSLVSTGDCPFI 
VCMT YAFHTP DKLCF I LDLMNGGD LH YHLSQHG VFS EKEMRFYA 
TE 1 1 LGLEHMHNRFWYRDLKPANILLDEHGHARI S \DLGLACD 
FSKKKPHASVGTHGYMAPEVLQKGTAYDSSADWFSIX5CMLFKLI> 
RGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPELKSLLEGLL 
QRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPP 
RGEVNAADAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERW 
QQEVTETVYEAWADTDKIEARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNLbTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGIi 


5955 


1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR 
PANRQDVLSGVJINIiPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ 
VWKRCINIWRDVGLFGVLNEIANSEEEVFEWVKTASGWALALCR 
WASSLHGSLFPHLSLRSEDLIAEFAQVTNWSSCCLRVFAWHPHT 
NKFAVALbDDSVRVYNASSTIVPSLKHRLQRNVASLAWKPLSAS 
VLAVACQSCILZWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSLA 
WAPSGGRLLSASPVDAAIRWDVSTETCVPLPWFRGGGVTNLLW 
SPDGS KILATTPSAVFRWJEAQMWTCERWPTIjSGRCOTGCWS pd 
GSRLLFTVLGEPLI YSLS FPERCGEGKG\ ALEVQSQQRLWQ I CL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWIjCL 


5956 


1705 


139 


GVG VRGARAMATVQEKAAALNLSALHS pahr PPG fs vaqkp fga 
TYVWSSI INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDVI FSHL 
IQNKYFGDVDI PRAKWRVCQALMDY KVFEAVPT KVFGKDKKPT 
FEDSSCSLYRFTT I PNQDSQLGKENKLYS PARYADALFKS sdir 
SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRIj 
LQLVDLPLLDSLIiKQQEAVPKI PQPKRQSTMVNS SNYLDRGILK 
AYSDSQEDEWLSAAIDCSE YLPDQM WEI SRS FPEQPDRTDLVK 
ELLFDAIGRYYSSREPLLNHLSDVHNGIAELLVNGKTEIALEAT 
QLLLKJjLD FQNREEFRRLL YFMAVAANPS EFKLQ KES DNRM WK 
RI FSKAI VDNKNLS XGKTDLLVLFlAMDHQKDVFKI PGTL\HKI 
VS \ VK\ LMAIQNGRDPNRDAG Y I YCQRI DQRDYSNNTEKTTKDE 
LLNLLKTLDEDS KLS AKEKKK\LLGQ F YKCHPDI FI EHFGD 


5957 


1479 


451 


ELQVAVAMDTLDRWKPKT KRAKRFLEKREPKLN EN I KNAML I K 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 
SKKSDCSLFMFGSHNKKRPNNliVIGRMYDYHVLDMIELGIENFV 
SLKDI KNSKCPEGTKPMLI FAGDDFDVTEDYRRLKS LLID FFRG 
PTVSNIRLAGXjEYVI*HFTAI»NGKIYFRSYXLLIiKKSGCRTPRIE 
LEEMGPSI^LVLPJiTHLASDDLYKLSMKMPKALKPKKKKNISHD 
TFGTTYGRI HMQKQDLSKLQTRKM\ KGLKKRPAER IT3DHEKKS 
KRI KKKLMELSQ PIaLFHCVIiIiKRI IKHQS I QSFI* 


5958 


1 


3138 


AAALGWIiliWFPACOAFNLDVEKLTVYSGPKGSYFGYAVDFHIPD 
ARTASVLVGAPKANTSQPD I VEGGAVY YCPWPAEGS AQCRQI PF 
DTTNNRKI RWGTKEPI EFKSNQWFG\ATVKA\HKGKSCGPVAP 
liLFTW RNFLKPT PE KGP VGTC YVAIQNFS AYAEFS PCGNSNADP 
EGQGYCQAGFSLDF YKNGDLI VGG PGS F YW QGQVI TASVAD I I A 
NYSFKDILRKLAGEKQTEVAPASYDDSYLGYSVAAGEFTGDSQQ 
ELVAG IPRGAQNFGYVS I INS YDMTFIQNFTGEQMAS YFGYTVV 
VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLYLQVSSLL 
FRDPQ ILTGTETFGRFGS AMAHLGDLNQDGYNDIAI GVPFAGKD 
QRGKVLIYNGl!raDGI^TKPFPKFCG<3VWASHAVPSGFGFTLRGD 
SDIDKNDYPDLITVGAFGTGKVAVYRARPVVTVDAQLLLHPMI IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQSIANTIVLMAEVQLD 
S LKQKGAI KRTLFLDNHQ AHRVFP LVIKRQKSHQCQDF I VYLRD 
ETEFRDKLSPINISLN YSLDESTFKEGLEVKP I LNY YRENIVSE 
QAHI LVDCGEDNLCVPDLKLSARPDKHOVI IGDENHLMLI INAR 
NEGEGAYEAELFVMI PEEADYVGI ERNNKGFRPLSCEYKMENVT 
RMVVCDLGNPMVSGTNYSLGLRFAVPRJjEKTNMSINFDLQIRSS 
N KDN PDSNFVS LQINITAVAQVE I RGVSHP PQ I VLP IHNWEPEE 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
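
By way of illustration only, the single-letter notation defined in the column heading above can be tallied programmatically. The following Python sketch is not part of the original disclosure: the names `AMINO_ACIDS`, `MARKERS`, and `summarize_segment` are hypothetical, and it assumes that whitespace inside a segment is a typesetting artifact that can be ignored.

```python
# Illustrative sketch only; all names here are hypothetical and not from the source.
# One-letter codes as given in the table legend ("X" is handled separately).
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Special symbols from the legend, mapped to the counter they increment.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment):
    """Tally residues and legend markers in one amino acid segment."""
    counts = {"residues": 0, "unknown": 0, "stop": 0,
              "deletion": 0, "insertion": 0, "unrecognized": 0}
    for ch in segment:
        if ch.isspace():
            continue  # assumption: spaces are layout artifacts, not data
        if ch == "X":
            counts["unknown"] += 1
        elif ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1  # any character outside the legend
    return counts
```

A non-zero `unrecognized` count for a segment would suggest that the segment contains characters outside the legend, for example transcription damage, rather than a valid residue or marker.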








E PHKEEB VG P LVEH I YELHN I G PS T I SDT I LE VGWP FS ARDE FL 
LYIFHIQTLGPU3CQPNPNINPQDIKPAASP2DTPELSAFLRNS 
TI PHLVRKRDVHWEFHRQSPAKI LNCTNI ECLQI SCAVGRLEG 
GE S AVLKVRS R LWAHTFLQR KND P YALAS L VS FEVKKMP YTDQ P 
AKLPEGSIAIKTSVIWATPNVSFSTPI.WVTTT rtttpt tvt&tt 
TLALWKCG FFDRAR PPQEDMTDREQI/FNDKTPEA 


5959


1 


1 1166 


GTSGYAAQQLPSUbKEREFHLGTIiNKVFASQWLNHROWCGTKC" 
NTLFWDVQTSQITKI P I LKDREPGGVTQQGCG IHAIELNPSRT 
LLATGGDNPNS LAI YRLPTLDP VCVGDDGHKDWI FS I AW ISDTM 
AVSG S RDGS MGLWE VTDDVLTKS DARHNVS RVP VYAHI THKALK 
DI PKEOTNPDNCKVRALAF^KNKEL^AVSLDGYFHLWKAENTL 
SKLLSTKLPYCRENVCLAYGSEWSVYAVGSQAHVSFLDPRQPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFL 
EERLS ACYGSKPRLAGENLKLTTG \ KGWLNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 
5961 


2853 


870


F VWS DGGPRPRRGPAVGAGAAKLS DP WAMTPGTANRATNPLNKE 
LD WAS ING FCBQLNEDFEGP PLATRLLAHKIQS PQEWEAI QALT 
VLETCMKS CGKRFHDE VGKFRFLtJ E L I KWS P K YLGS RTS EKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVKSDPKLPDDT 
TFPLPPPRPKNVT FEDEEKSKMIiARLLKSSHPEDLRAANKLIKE 
i*. vy fcijy KKM£K± S KR VNAI EEVNNNVKLLTEMVMSHSQGGAAAG 
S S EDL \ MKE L \ YQRCERMR PTLFPTGRVDTEDND\ EALAE I LQA 
NDKLTQVINLYKQLVRGEEVNGDATAGSIPGSTSALLDLSGLDL 
P PAGTT Y PAM PTRPGEQAS PEQ PS AS VSLLDDELMS LGLS D PTP 
PS G P S LDGTGWNS FQSS DATEP PAPALAQAPSM ESR P PAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LESIKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LSTAPQ PI RN I VFQS AVP KVMKVKLQ P PSGTELPAFNPI VHP SA 

ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 




198 


3147 


SGE PRPEPGNMATCIGEKI EDFKVGNLLGKGS FAG VYRAES I HT 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQIIT 
GMLYLHSHG I LHRDLTLSNLLLTRNMN I KIADFGIATQL KMPHB 
KH YTLCGTPNY I S PEIATRS AHGLESDVWS LGCMFYTLL I GR P P 
FDTDTVKNTLWKVVLAD YEMPTFLS IEAKDLIHQLLRRN'PADRlj 
SLS S VLDHPFMSRNS STKS KDLGT VEDS IDSGHATI STAI TAS S 
STS I SGS LFDKRRLLiIGQPLPNKMTVFPKNKSS TDFSS SGDGNS 
FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEMLS VSKRSGGGENEER YS PTDNNANI F 
NFFKEKTS SSSGSFERPDNNQALSNHLCPGKTP FP FADPTPQTE 
TVQQWFGNLQ I NAHLRKTTE YDS ISPNRDFQGHPDLQKDTSKNA 
wu 'w«*»uwflHi) v KQQNTMKYMTALHSKPE I IQQECVF 
GSDPLSEQSKTRGM3PPWGYQNRTLRSITSPLVAHRLKPIRQKT 
KKAWS I LDSEEVCVELVKE YASQE YVKE VLQ I S SDGNTI T I YY 
PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
SRFVQLVRSKS PKIT YFTRYAKCILMENS PGADFEVWFYDGVKI 
H KTEDF I QVIE KTGKS YTLKS ES EVNSLKEE IKM YMDHANEGHR 
ICLALES I ISEEERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 
VDSN YPTRDRAS FNRMVMHSAAS PTQAP ILNPS M VTN3GLGLTT 
TASGTD I S SNSLKDCLPKSAQLLKS VFVKNVGWATQ\ LTSGAVW 
VQFNDGSQLWQAGVSSISYTSPNGQ\TTR\YGENEKLPDYIKQ 
KLQCLSS ILLMFSNPTPNFH 




5962 


20 


2447 

] 


RVCSS S AS TASQAVMADAWE E IRRLAADFQRAQFAEATQRLS ER " 
MCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GRVNIVDLQQVINVDLlHIENRIGDIIKSEiaiVQLVLGQLIDEN 
YIjDRLAEEVNDKLQESGQVT I SELCKTYDLPGNFLTQALTQRIjG 
RI ISGH IDLDNRGVI FTEAFVARHKAR IRGLFSAITRPTAVNSL 

iskygfqeollysvleslvnsgrlrgtwggrqdkavfvpdiys 

^TQSTWVDSFFRQNGYLEFDALSRI^IPDAVSYIKKRYKTTQLL 








FLKAACVGQGLVDQVEASVEEAISSGTWVDIAPLLPTSLSVEDA 
AILLQQVMRAFSKQASTWFSDTVWSEKF\INDCTELFRELMH 
QKAEKSMKNNPVHLITEEDLKQISTLESVSTSKKDKKDERRRKA 
TEGSGSMRGGGGGNAREYKIKKVKKKGRKDDDSDDESQSSHTGK 
KKPEISFMFQDEIEDFLRKHIQDAPEEFISELAEYLIKPLNKTY 
LEWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALTKHI*LKSVCTDITNLIFNFLASDLMMAVDDPA 
AI TSE IRKKI LSKLSEETKVALTKLHNSLNEKS I EDFI SCLDSA 
AEACDI MVKRGDKKRERQ r LFQHRQAIiAEQLBCVTEDPAL I LHLT 
SVLLFQ FSTHS MLHAPGRC VPQI I AFLNSKI PEDQHALLVKYQG 
LWKQLVSQSKKTGQGDYPLNNELDK3QEDVASTTRKELQELSS 
SIKDLVLKSRKSSVTEE 


5963 


62 


1130 


PWNPQDFPGNRGLMG \QKGE IG P P\GQQGKKGAPGMP \GLMGSN 
GS PGQ PGTPGS KGSKGE PG I QGMPGASGIiKGEPGATGSPGE PG Y 
MGLPG I QGKKGDKGNQGE KG IQGQ KGENGRQG I PGQQG IQGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GP PG LDG KPGRE FS EQF I RQ VCTD V I RAQIi P VI>LQSGR I RNCDH 
CLSQHGS PGI PGPPGPIGPEGPRGLPGLPGRDGVPGIjVGVPGRP 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
PPGDPGLPGKDGDHGKPG I QGQPGPPGI CDPSLCFS VIARRDPF 
RKGPNY 


5964 


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGIiYNDPNSN 
PKIVQIiIiAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGE VTFENVKE I FGQT I IHHH I P FNWDCEF I RXHFGHNR 
KKHLNYTEFTQFLQELQI.EHARQAFALKDKSKSGMISGI.DFSDI 
MVTIRSHMLTPFVEENLVSAAGGSISHQVSFSYFNAFNSLLNNM 
FT A/RK T vfiTT .ARTR TCD AKVTlf FFFAO<5 AT R YRfJATPT .PTDT T ,Yn 
LADLYNASGRLTLADIERIAPIaAEGALPYK^^ 
PIWLQIAESAYRFTLGSVAGAVGATAVYPIDLVKTRMQNQRGSG 
S WGEIiMYKNSFDCFKKVI»R YEGFFGLYRGLI PQL I G VAPE KAI 
KLTVKDFVRJJKFTRRDGSVPLPAEVIiAGGCAGGSQVI FTNPLEI 
VORLQVAGEITTGPRVSALNVLRDLGI FGLYKGAJCACFLRDI p 
FSAIYFPVYAHCKLLIiADENGHVGGLNLLAAGAMAG\VPAASl*V 
TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAARVFRSS PQFG\VTI*VTYELLQRGFYIDFGGI»KPAGSEPTPK 
SRIADIiP PANPDH I GG YRIjATATFAG IEN KFGL YIiP KFKSPSVA 
WQPKAAVAATQ 


5965 


1 


1498 


MVTWLYRFIvPTSNMAAiOjRSl^PPDLRLQFWLJiARLQKCF^ 
CGSYCAGAKASPLPGKMAMGLMCGRRELLRLLQSGRRVHSVAGP 
SQWLGKPLTraLLFPAAPCCCKPHYLFLAASGPRSLSTSAISFA 
EVQVQAP P WAATPS P TAVP E VASGETADWQTAAEQS FAELGL 
GSYTPVGLIQNLLEFMHVDLGLPWWGAIAACTVFARCLIFPLIV 
TGQRFAARII1NHLPEIQKFSSRIRFAK1AGDHIEYYKASSEMAL 
YQXKHGI KIjY KPL I LP VTQAP I FI S FF I ALREMANI»P VPS LQTG 
GLW WFQDI»TVS DP I Y I I»PlA\n*ATMWAVIiELGAETG VQS SDLQ W 
MRNVIRMMPLI TLP ITMHFPTAVFM YWLSSNI.FSLVQVSCLR1 P 
AVRTVLKI PQRWHDLDKLPPREGFLES FKKGWKNAEMTRQLRE 
REQRMR^QI*ELAAJRG PLRG/TFTHNPLLQ PGKDNPPNI PSS \SSS 
SSKPKSKYPWHDTLG 


5966


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF



WO 01/53312 




5967 






KDRMKSDHKRETERVVREALEKLRSEMEEEKRQAVNKAVANMQG
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR




5968 


102 


1925


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF
KDRMKSDHKRETERVVREALEKLRSEMEEEKRQAVNKAVANMQG
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR




5969 


81 


1288


VRFPRRGGA^PTVKrPGRQQGVFLGPQRPGSEPDIPARGQPHPP ' 

RPVGVSTSAQAQVQPPA^RRRIALGLGFCLLACTSLSVLWVYL, 

ENWLPVSYVPYYLPCPEIFNMKLHYKREKPLQPWWSQYPQPKb 

LEHRPTQLLTLTPWLAPIVSEGTFNPELLQHIYQPLNLTIGVTV 

FAVGN/HFI*ESAEEFFI^Rf?YP\/HWTT?Tr»Kii37*>vTri-»<-.TrT%T 

' c xrcvnxilr I UN P AA V PG VPLG PHRL 

LSSIPIQGHSHWEETSMRRMETISQHIAKRAKREVDYLFCLDVD 

MVFRNPWGPETLGDLVAA1HPSYYAVPRQQFPYERRRVSTAFVA 

DSEGDFY YGGAVFGGQ VARVYEFTRGCHMA 1 IADKAN3 IMAAWR 

EESHIiNRHFISWKPSKVIiSPEYLWDDRKPQPPSIiKLIRFSTLDK 
DISCLRS 




5970 


1126 


503 


WGFNIKRKRCU1,DVFLESPRKPSGRRDRAPEKQRRIAANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIWLFGISITGGLFYTI 
FKELFSSSSPSKIYGRALEKCRSHPEVXGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKIXSLKHTCVKFYIEGSEPGKCXJTVYAQVKENP 
GSGEYDFRYI FVEIES YPRRTI I IEDNRSQDD 






316 


4712 



SQDN IGHRljijQKHG WKLGQGLGKS LQGR TDP I p I WK YDVMGMG 
RMEMKi.DYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDLRANF YCEL CDKQYQ KHQE FDNH INS YDHAHKQRLKDIiK 
QREFARNVSSRSRKDEKKQEKALRRI.HELAEQRKQAECAPGSGP 
MFKPTTVAVDEEGGEDDKDESATNSGTGATASCGLGSEFSTDKG 
GPFTAVQITNTTGLAQAPGIASQGISFGI KNNLGTPLQKLGVSF 
SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 
GDSDGSSNLDGKKEDEDPQDGGSIASTLSKLKRMKREEGAGATE 
PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 
KKGSS PKPKSCI KAAAS QG AEKTVS E VSEQPKETSMTEPS *PGS 

KAEAKKALGGDVSDQSLESHSQKVSBTQMCESNSSKETSLATPA 

GKESQEGPKHPTGPFFPVLSKDESTALQWPSELUFTKAEPSIS 

YSCNPLYFDFKIiSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 

PKKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 

TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 

SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 

PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 

KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 

KS PS Q YS EEE EEEDSGS EHS RSRSRS GRRHSSHRS SRRS YSS SS 

DAS S DQ S CYS RQRS YSDDS YSD YSDRSRRHSKRS HDSDDS D YAS 

3KHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 

5 SCSRS RS KR RS RS TTAHS WQRSRS YSRDRSRS TRS PSQRSGS R 

CRS WGHES PEERHSGRRD FI RS KI YRSQS PH YFRSGRGEG PG KK 

JDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL 

JKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 

jGNKPVLPLIGKLPATRKPNKKCEESGIiERGEEQEQSETEEGP 1 * 

JSSDALFGHQFP \SEETTGPIiLDPPPE ES KSGEVTADHPVAPLG 

'PAHFDCYLGDPTlSHNYLPDPSDGNTLESLDSSSQPGPVESSL 

•PIAPDLEHFPSYAPPSGDPS I ESTDGAEDA\sIAPIjBSQPI TP 








TPEEMEKYS KLQQAAQQH1 QQQLLAKQVKAFPAS AALAPATPAL* 
QPIHIQQPATASATS ITTVQHAILQHHAAAAAAAIGI HPHPHPQ 
PLAQVHHIPQPHLTPISLSHLTHSI IPGHPATFLASHPIHI IPA 
SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPI 
FSGQDLQHPPSHGT 


5971


53 


2149 


SFLYFVGVDMDNP IGNWDGRFDGVQLCS FACVEST I LLHIND 1 1 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RSS LFYTLNGS S VDSQPQS KS KNTWY I DE VAEDP AKS L»TE I STD 
FDR3SPPLQPPPWSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQS VWEELNTAP VQES P PLAMPPGNSHGLEVG SLAE VKENP 
P F YG VIRW I GQP PGLNEVLAGLELEDE CAG \ CTDGTF / R EGTR Y 
FTCALKKALFVKLKS CR PDS RFAS LQ P VSNQ I ERCNS LAI WEAY 
LSEVVEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF 
CXFAFSSVLDTVLLRPKEKNDVE YYSETQELLRTEI VNPLR I YG 
YVCATKIMKLRKI LEKVEAASGFTSEEKDPEEFLNILFHH I LRV 
EPLLKIRSAGQKVQDCYFYQI FMEKNEKVGVPTIQQLLEWS FIN 
SNLKFAEAPSCLI IQMPRFGKDFKLFKKI FPSLELNITDLLEDT 
PRQCRICGGLAMYECRECYDDPD.TSAGKIKQFCKTCNTQVHLHP 
KRU^HKYNPVSIiPKDLPDWDWRHGCIPCQNMELFAVLCIETSHY 
VAFVKYGKDDSAWLFFDSMADRDGGQtJGFNIPQVTPCPEVGEYI. 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTQSPTMSbYK 


5972


440 


1761 


ILLAGSPSPRDQCSQRQSSGGDKEIiVTRGCTFSTAWSPSAMTQ 
EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 
PSAIVSFTVSRRNANVIPNFQILFVSTFAVTTTCLIWFGCKLVI. 
NPSAININFNLI LLLLLELLMAATVI I AARS SEEDCKKKKGSMS 
DSANI LDEVP FPARVI>KS YS WEVI AGISAVLGG I IALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 
TSPLLFTASGYt^FSIMRIVEMFKDYpPAIKPSYDVLLLLLLLV 
LLLQA/ G PQHGHRHPVRALQGQC KAAG CILGH P ER PAGAPG WGG 
GQE P PEG VRQG ESLESRRGANGP VTPRRGNRVAAP S LAPGMETH 
NP 


5973 


65 


2007


NGDGKDL FGH I WAWRSNG 1 1 SNFRRS PHAGMAEDE PDAKS P KTG 
GRAP PGGAE AG E PTTLLQRLRGT I S KAVQNKVEG ILQD VQKFS D 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKIIREIFPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
E VTPAPRDELVEAACALTCDWAER I LKRS FSS I VEVARFLLQQH 
LI S ARS AHAHVLKAMGLAEEDEHAPRERS SKPKNGLEMPEGGAH 
KKPERLAQPPKDLEARTGAGPLARGERKKSWESSAPGANNLQV 
NALVARLPLLL PRAP RS L I P P I PVS PP I LAPRLS SGALKVATLP 
LSSRAGAPPAAVP I INM I LPT VPALPG PGPGPGRAPPGGLTQPR 
GTENREVG IGGDQGPHDKGVKRTAEVP VS E ASGQAP P AKAAKQD 
I EDTAS DAKRKRGRPLKKSGGSGERNS TPLKSAAAM ESAQS S RL 
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVLAQG\QGDGTVSK 
GGRGPGS QHTKEAEDKI PLVPS KVSVI KGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQSSLSQEHKDPKATPP 


5974 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR
TV\ASIKNDPPS\RDNRVVGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDVVMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM








EEQVVEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR
TV\ASIKNDPPS\RDNRVVGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDVVMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM
EEQVVEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5976 


20 


2949 


VHHLHLTRVSVWNLDIILRIAQQMGIKTLNLVLG\LKRA\LEF 
P E VS WME V KD PNMKGAMLTNTGKYAI PTI DA\ EAYAIG KKEKP P 
FLPEEPSSSSEEDDP I PDELLCLI CKDIMTDAWT PCCGNS YCD 
ECIRTALLESDEHTCPTCHQNDVS PDAL I ANKFLRQAVNNFKNE. 
TGYTKRURKQLPSPPPP I PPPRPLIQRNLQPLMRS PI SRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 
ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA 
LMEEKGYQVPVLGTPSLLGQSLLHGQLXPTTGPVRINTARPGGG 
RPGW EHSN K LG YL VS P PQQ I RRGERS CYRS I NRGRHHSE RS QRT 
CGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFPPAPANLSTPr7VSSGVQTAHSNTIPTTQ 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRS FSRSKS P YSG SSYSRSS YTYS KSRSGS TRSRS YSRS FS 
RSHSRSYSRSPPYPRRGRGKSRJNYRSRSRSHGYHRSRSRSPPYR 
RYHSRSRSPQAFRGQSPKKRNVPQGETEREYFNRYREVPPPYDM 
KAYYGRSVDFRDPFEKERYREWERKYREWYEKYYKGYAAGAQPR 
PSANRENFS PER FT. PLNIRNSPFTRGRREDYVGGQSHRSRN I GS 
NYPEKLSARDGHNQKDNTKSKEKESEWAPGIX3KGWKHKKHRKRR 
KGEESEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDE PMDAES ITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 

TILNHHLPLRRMKKSL\ EP P\ EKLTLNQQK\TPRNKTSQRGKS E 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHS INHI LHPGAGVAAGPATGW/RE YLT ' 
PVLKES KFKETGV I TPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKN VPCYKRCKQME YSDE LEAI I EEDDGDGG WV 
DTYHNTG I TGI TEAVKE I TLENKDNI RLQDCSALCE EE EDEDEG 
EAADMEEYEESGLLETDBATI^TRKIVEACKAKTDAGGEJDAILQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVTI ENHPHLPPPPMCSVHPCRHAEVMKKI IETVAEGGGEL 
GVHMYLLI FLKFVQAVIPTI EYDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ 
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELVNCRWAEEVTQQLDTLQ 
LCSLTKJfEENEKDKCENHHEKLSVFCWTCKKCICHQCAljWGGI^IK 
GGHTFKPLAEIYEQHVTKVNEEVAKLRRRLMELISDVQEVERNV 
EAVRNAKDER VRE I RNAVEMM IARLDTQLKNKLITLMGQKTSLT 
QETBLLESLLQE VEHQLRSCS KSELIS KSSEI LMMFQQVHRKPM 
AS FVTTPVP PDFTS ELVPS YDS ATFVLENFSTLRQRADP VYS PP 
LQVSGLCWRLfCVYPDGNGWRG YYLS VFLELS AGLPETS KYE YR 
VEMVHQS CNDPTKNI I RE FAS DFE VGECWG YNR FFRLDLLANEG 
YLNPQN0TVII^FQVRSPTFFQKSRDQHWYITQLEAAQTSYIOQ 
INNLKERLTIELSRTQKSRDLSPPDNHLSPQNDDAI.ETRAKKSA 








CSDMLLER\G PYSAS \VREAKEDEEDEEKIQNEDYHHEI>SDGDI* 
DLDLVYEDEVNQLDGSS55ASSTATSNTEEND I DEETMSGENDV 
EyNIWEXrEEGELMEDAAAAGPAGSSHGyVGSSSRISRRTHLCSA 
ATSSLIiDIDPLILIHLLDLKDRSS I ENLWGLQPRPPASLLQPTA 
S YSRKDKDQR KQQAMWRVPSDLKMLKRliKTQMAE VR CMKTDVKN 
TLSE I KSSSAASGDMQTSLFSADQAALAACGTENSGRLQDLGME 
LLAKSS VANCYI RWSTNKKSNS PKPARS SVAGSLSLRRAVDPGE 
NSRSKGDCQTX.SEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 
DR QCKALDS DAVWAVFSG LPAVEKRRKMVTLG ANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 
CEEHTSVGGFHDSFMVMTQPPDEDTHSSFPDGEOIGPEDLSFNT 
DENSGR 


5979 


212 


3665 


LPDM'1*MYL»WIjKLLAFGFAFLDTEVFVTGQSPTPSPTDAYLNASE 
TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYLYNKETK 
LFTAKLNVNENVECJGNNTCTNNEVHNLTECKNASVS ishnscta 
PDKTLI LDVPPGVEKVPVHCCS \QVEQPDS T I WLKWKNI ETSTC 
DTQNI TYRFQCGNMI FDNKEIKIiENLE PEHEYKCDSEILYNSHK 
FTNASKI IKTDFGSPGEPQI I FCRSEAAHQGVITWNPPQRS FHN 
FTLCYI KETEKDCIiNLDKNLIKYDLQNLKPYTKYVLSLHAY 1 1 A 
KVQRNGSAAM CHFTTKSAP PSQVWNMTVSMTS DNS MHVKCR P PR 
DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFLI IVTSI ALLW 
LYKI YD LHKKRS CNLDEQQELVE RDDE KQLMNVEP I HAD I LLET 
YKRKIADEGRLFIoAEFQS I PRVFSKFP I KEARKPFNQNKNRYVD 
ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DET VDDFWRM 2 WEQKATVI VM VTRCEEGNRNKCAE YWPSMEEGT 
RAFGECCCKDLTKHKRCP\DYI IQKLN I VNKKEKATGREVTHI Q 
FTSWPDHGVPEDPHIiIiLKLRRRVNAFSNFFSGPIWHCSAGVGR 
TG'rYIGI DAMLEGLEAENKVDVYGYWKLRRQRCLMVQVEAQY I 
IjIHQALVEYNQFGETEVNLS ELHP YjLHNMKKRDP PS E PS PLE AE 
FQRLP S YRSWRTQHI GKQE \ ENKSKNRNSNVI P YDYNRVP LKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 
AAQGPI*KETIGDFWQMIFQRKVKVIVM1.TE1»KHGDQEICAQYWG 
EGKQTYGDIEVDLKDTDKSSTYTLRVFELRHSKRKDSRTVYQYQ 
YTNWSVEQLPAEPKELISMIQVVKOKLPQKNSSEGNKHHKSTPL 
LIHCRDGSQQTGIFCALLNLLESAETEEWDIFQWKALRKARP 
GMVSTFEQ YQFI»YD VI ASTYPAQNGQVKKNNHQED KI E FDNEVD 
KVKQDANCVNPI»GAPEKIiP EAKEQAEGS EPTSGTEG PEHSVNG P 
ASPALNQGS 


5980 


3 


2363 


DAWGCKLRRLR FT YGTQTR VS LALPGQ Y EL VHTLVAHQGNWET I 
PEEDLEVQENNEDAAHDLTELEVTMHHAIiliQEVDVVVAPCQGIiR 
PTVDVLGDLVOT>FIiPVITYAIjHKDELSERDEQEIiOEIRKYFSFP 
VFFFKVPKLGSE I IDSSTRRMESERSPLYRQDIDLGYLSSSHWN 
CGAPGQDTKAQSMLVEQSEKLRHI.STFSHQVLQTRI.VDAAKALN 
LVHCHCLDIFIKQAFDMQRDLQITPKRLEYTRKKENELYESLMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTREIKCCIRQIQEI»I ISRLNQAVANKLI SSVDYLRES FVGT1» 

erci^sleksqdvsvhitsnylkqilnaayhvevtfhsgssvtr 
mlweqikqi iqr itwvsppaitlewkrkvaqeaieslsas klak 
S I csqfrtrlnssheafaaslrqleaghsgrlektedlwlrvrk 

WGGHFPCAliKSWPPDEKWfl^lJu^FHYMRSLPKHERLVDLKG 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
DWEG IRFLHSQGIiVHRDIKLKNVLLDKQNRAKITDLGFCKPEA 
MMSGS I VGTP1 HMAPEIiFTGKYDNS VDVYA FG I LFW Y I CSGS VK 
IiPEAFERCAS KDHL WNNVRRGAR PERLP VFDEECWQLMEACWDG 
DPLKRPLLGIVQPMLQGIMNRLCKS\NSEQPNRGLDDST 


5981 


1 


2519 


GRRHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRI. 
DAPPPPAAPI*PRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
G\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSI»SWVGDSTGV 
ILVLTTFHVPLiVIMTFGQSKLYRSEDYGKNFKDITDIilNNTFIR 


5982 






TEFX3MAIGPEKSGKVVLTAEVSGGSRGGRIFRSSDFAKNFVQTD 
LPFHPLTQMMYSPQNSDYLIALSTENGLWVSKNFGGKWEEIHKA 
VCLAKWGSDNTIFFTTYANGSCKADLGALELWRTSDjLGKSFKTI 
GVKIYSFGIiGGRF^.FASVMADKDTTRRIHVSTDQGDTMS^4AQLP 
SVGQEQFYS ILAANDD.WFMHVDEPGDTGFGTI FTSDDRG I VYS 
KS LDRHLY TTTGGE TD FTNVTSLR fiVY 7 T Q VT .C •!? nw c rr\-PM t rr» 

DQGGRWTHLRKPEMSECDATAKNKNECSLHIHASYSISQKLNVP 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EG PHY YT I JjDSGG 1 1 VAI EHS S R P I NVI KFS TDEGQCWQTYTFT 
RDPI YFTGLASEPGARSMNIS IWGFTES FLTSQWVS YTI DFKDI 
LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 
QNGRDY WTKQPS I CLCS L ED FLCDFG Y YR P ENDS KCVE QPELK 
GHDbEFCLYGREEHl»TTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFI^PEKQNSKSNSVPIII^IVGLMLVTVVAGVLIVKKYVC 

GGRFLVHI»YSVLQQH\AEA\NGVDGVDATjDTASHTNfCSaYHDDS 
DEDIiLE 


5983 


56 


2316 


ATR? PRGS £> WCRQFSRTASAA PGRSNMLR I PVRKALVGLS KS PK~ 
GCVRTTATAASNLI E VFVDGQS VMVEPGTT VJjQ ACE KVGMQI PR 
FC YHERJjS VAGNCRMCLVEI EKAPKWAACAMPVMKGWNI LTNS 
E KS KKAREGVM E FLkANH PLDCP I CDQGGECD LQDQSMM FGNDR 
SRFLEGKRAVEDKNIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 
LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 
cn. vf j-oua i Kr AXIAjjjiUiQRUTEPMVRNEKGlxLT YTSWEDALSRV 
AGMl^SFQGKDWVAIAGGLVDAEALVALKDLLNRVDSDTLCTEE 
VFPTAGAGTDIjRSN YI>LN TTIAG VEEAD VVL Jj VGTNFR FE APL F 
NAR^RKS WLHNDLKVALIGSPVDIaT YTYDHLGDS PK1 LQDI ASG 
SHPFSQVLKEAXKPMVVLGSSALQRNDGAAI1AAVSS1AQKIRM 
TSGVTGDWKVMNILHRJJ^QVAALDLGYKPGVEAIRKNPPKVI.F 
LLGADGGG1TRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 
EKSATYVNTEGRAQQTKVAVrPPGLAREDWKI IRADSE1AGMTL 
PYDTL \ DQ VRNRLEEVS PNLVR YDD JEG \ AN YFQQANEIiS KTiVN 
QQLLADPLVPPQLTMKDFYMTDS I SRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5984


248 


1763 


EARGDGGRRRHRASGRRAGRGEP\AGLKSQGQRAVPKRAVARGG 

RO\ YSAAIAXjTtFPAnCIT? T innr.CTT VOMDTV -*/->T7T 1/nmk>nnnnrn 

w * * uwixnijujir nu^ a i-^uuoi iii jj^KAAL YLK2GNCSGCIQ 
DCNRAL3LH PFSMKPLLRRAMAYETLEQYGKAY VD YKTVLQIDC 
GLQIiANDSVNRLS RILMELDG PNWREKLSLrl PAVPASVPLQAWH 
PAKEMISKQAGDS SSHRQQGI TDEKTFXAIiKEEGNQCVNDKNYK 
DALS KYS ECLKINNKECA I YTNRALCYLKr.rnFFFa. X-nnrrvo a t 

QLADGNVKAFYRRALAHKGLKNYQKS LIDLNKVI LDDPSI I EAR 
ME^EEVTRLLNIjKDKTAPFNKEKERRKI e iqe vnegkeepgrpa 
GEVS TGCLAS E KGG KS SRS PED PEKL P I AKPNNAYEFGQI INAI» 

STRKD KEACAHIjIiAI tapkdlpmflsnklegdt flll iqslknn 

LIEKD PS LVYQHLL YLS KAER F KMMLTL I S KGQKEL I EQLFEDli 
SDTPNNHFTLEDIQALKRQYEL 


5985 


755 


1193 


SSVC^CTWSNLGKKQRSVSFI^GLMRVSTGPELRLHHSFVL- 
TGDVGRRICRLLVGLFTKGDTSSKRVHPFSPGPCFI.LCDLARVG 
S S PK INVS P F YQN \QTSTQRS CTVF VWQRCSLVG P FQVTVFTM Y 
FHHSLRSISRFSSG 




22 


1408 

< 
1 


RRVARPGTAEPAKARRTVRRGRARRDLAGAERKAGVSERGDSGR 
RRPNPS IPSAAAGMSHIQI PPGjLTELLQGYTVEVLRQQPPDLVE 
FAVEYFTRLREARAPASVLPAATPRQSLGHPPPEPGPDRVADAK 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HPKTDEQRCRLQFACKDILLFKWLDQEQLSQVLDAMFERIVKAD 
EHVI DQGDDGDNF YVI ERGT YD I LVTKDNQTRS VGQYDNRG S FG 
ELAIjM YNTPRAAT I VATSEGS LWGLDR VTFRR 1 1 VKNNAKfCRKM 
FESFIESVPLLKSLEVSERMKIVDV1GBKIYKR/DGERIITQGE 
KXADSFYIISSGEVSILIRSRTKSNKDGGMQEVEIARCHKGQYF 
3EIJU^VTNKPRAASAYAVGDVKCLVMDVO^FERI,IiGPCMDIMKR 
•41 SHYEEQLVKM FGSSVDLGNLGQ 


5986 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL
TV 


5987 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL
TV 


5988 


1292 


410 


FKKYFLSFIX3LLESSHSRDRIHNLVLMFLLATHNLVWWFTCRFQ 
RLDCI YLNAG IMPNPQLNI KALLFGLFS \ AEGLLTQGDKI TADG 
LQE VFETDVFGHF I LIREliE P LLCHSDNPSQIi I WTS SRNARKSN 
FSLED FQHS KGKE P YSS S KYATDLLS VALNRNFNQQGLYSNVAC 
PGTALTNIjTYG I LPPFI WTLLMPAI LliLRFFANAFTLTP YNGTE 
ALVVJLFHQKPESIjNPIjIKYLSATTGFGRNYIMTQKMDLDEDTAE 
KFYQKUjBLEKH I RVTI QKTDNQARLSGS Cl» 


5989 


194 


2610 


AMDFPQHSQHVLEQLNCX)RQLGI*LCDCTFVVDGVHFKAHKAVLA 
ACSE YFKMIiFVDQKDWHLDI snaaglgqvlbfm YTAKLSLS pe 
nvddvlXavatflqmqdiitachalkslaepatspggnaeaiat 
eggekrakeekvatstlsrleqagrstpigpsrdlkeerggqaq 
saasgaeqtekadaprepppvelkpdptsgmaaaeaeaalsess 
eqemeveparkgeeeqkeqeeqeeegagpaevkeegsqlengea 
peeneneesagtdsgqelgsearglrsgtygdrteskaygsvih 
kcedcg kefthtgnfkrh 1 r ihtgekp fscrecskafsdp aack 

AHEKTESPLKPYGCEECGKSYRLISLLNIjRKKRHSGEARYRCED 
CGKLiFTTSGNLKRHQLVHSGE KP YQCD YCGRSFSD PTS KMRHLE 
THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 
FTTSGNLKRHLRIHSGEKPYVCIHOQRQFADPGALQRHVRIHTG 
EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 
SQLANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHI I IHTGEKPY 
LCDKOGRGFNRVDNLRSHVKTVHQGKAG I KILEPEEGSEVS WT 
VDDMVTIiATEAliAATAVTQLTVVP VGAAVTADETEVIiKAEI S KA 
VKQVQEEDPNTHILYACDSCGDKFLDANSIiAQHVRIHTAQALVM 
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSLSRLGPSXJRDKDLEMEELMLQDETLLGTMQSYMDASIiIS 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQ3 P P PQQRSDGEEEEEVAS FSGQILAGELDNCVSS I PDFP 
MHLACPEEEDKATAAEMAVPAAGDES I SSLSELVRAMHPYCLPK 
IiTHLAS LEDE LQEQPDDLTLPEGCWLEI VGQAATAGDDLEI PV 
VVRQVSPGPRPVIJJDDSLETSSAIiQLLMP'rLESETEAAVPKVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDNLQKQPQEELQKESGPIiQGKGKPRAWARAWAAALENSSPKN 
LERSAGQSS PAKEGPLDLYPKTiADTIQTNPI PTHLSIiVDS AQAS 
PMPVDSVEADPTAVGPVLAGPVPVDPGIjVDIiASTSSEIiVEPIiPA 
EPVLINPVLADSAAVDPAWPISDNLPPVnAVPSGPAPVDLALV 








DP VPNDLTP VDP VX»VKS R PTDPR RGAVS S ALGGS A POIj L VES E S 
LDPPKTIIPEVKEWDSLKIESGTSATTHEARPRPLSLSEYRRR 
RQQRQAETE K KS PQ P PTGKWPS L P ET PTGLAD I P CL. V I P PAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAAliPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPU3WGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
y !KV S ALVQS PQMKAIiACV SAEGVTVEEPASERLKPETQETRPR E 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 
GNSGGVDI PQEKRPLDRLQAPEL»ANVAGLT PPATPPHQLWKPLA 
AVSLIAKAKS PKSTAQEGTLKPEGVTEAKI I PAAVRLQEGVHGPS 
RVHVG3GDHDYC\VRSRTPPKK\MPAIiLIPEVGSRWNVKRHQDI 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAFS 
SLLS PEAS P CRNDMNTRTPPEPSAKQRSMRC YRKACRSAS PSSQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVLQKERAIEERRWFIGK 
IPGRMTRSELKQRFSVFGEIEECTIHPRVQGBNYGFVTYRYAEE 
AFAAIESGHKLRQADEQPFDLCFGGRRQFCKRSYSDLDSNREDF 
DPAPVKSKFDSLDFDTLLKQAQKNLRR 


5991 


334 


1379 


RLSSHFSQCSPSIYC\TKFDKQGNVTSFERKKTELYQELGLQAR 
DiiRFQHVMS I TVRNNR I 1 MRME YI»KA VI TP ECLL.I IJD YRNLNLK 
QWLFRELPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKI. 
SILQPLILETLDALGDPKHSSVDRSKiiHILLQNGKSLSELETDI 
.KIFKESILEILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM 
ELLLEN YYRItADDLSNAARELRVL IDDSQS 1 I FINLDSHRNVMM 
RLNLQI*TMGTFS LSLFGLMGVAFGMNTjES S LEEDHR I FWLI TG I 
MFMGSGIiI WRRLIiS FLGR/LARSS IAS YGMKDMVHGGI VEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINIi/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS I LGTGLLWLPGG I KLFAVFYTLGNLAALASTCFLMGPVK 
QLKKMFBATRLIATIVMLLCFIFTLCAALWWHKKGLAVLFCILQ 
FLSMTWYSLS YI PYARDAVI KCCSSKLS 


5993 


1650 


594 


AEGLGS WAVWAGLG WAGRHMEAGGATGAJjGVGC KLPSAF C FPGS 
SVAMDMFQKVEKIGEGTYGWYKAKNRSTGQLVALKKIRLDLEM 
EGVPSTAIREISLLKELKHPNIVRLLDWHNERKLYLVFEFIiSQ 
DLKjECYI4DSTPGSELPIJII,IKSYLFQLLQGVSFCHSHRVIHRDrjC 
PQNLLINELGAIKIJu^FGLAPJ^FGVPLRTYTHEVVTIjWYRAPEI 
LLATRFYTTAVDI WS IGCI FAEMVTRKALFPGDS \EIDQ \ LFRI 
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG 
RDLLMQbLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994 


394 



1934 


AGEVQLHVWI RGMRIQPQ/KAAAI IDLDPDFEPQSRPRSCTWPL - 
PR P E I ANQPS KPPEVEPDLGE KVHTEGRSEP I LI»PSRLPE P AGG 
PQPGILGAVTGPRKGGSRRNAWGNQSYAELISQAIESAPEKRLT 
LAQI YEWMVRTVPYFKDKGDSNSSAGWKNS IRHNLSLHS KF IKV 
HNBATGKSSWWMLNPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGl>PAPPEGATPTSPVGHFAKWSGSPCSRNREEADMTiTT 
FRPRSSSNASS VSTRIiS PIiR PFcswvT.AF T^T Pa cvcc varvtrDrvt* 

LNEGLEIiLDGLNLTSSHSLLSRSGI*SGFSLQHPGVTGPI,HTYSS 
SLFS PAEGPLSAGEGCFSSSQALEALLTSDTPP PPADVLMTQVD 
PI LS QAPTLLLLGGLPS S S KLATG VGLCPKPLE APG PSS I*VP TL 
SM I AP P P VMAS AP I PKAJLGTPVLTP PTEAASQDRMPQDLDIjDMY 
MENLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAWI#CTRARGSAAFVPPI»PRPPSRGARRRRRIjPGR
GVAAIiRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEEIiHSL \ DP \RRQEJbLEARF\ TGIiGVSKGPLNSESSNQS Li 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 
ISDYFERRVEQPLYGLDGSAAKEATEEQSALPTI^SVMLAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQI S I 
QHRQT\QSDLTIEKISALENSKNSDliEKKEGRIDDLLRANCDLR 
RQI \DEQQKMLEKYK\ERLNRCFDNEPRNFLI EKS KQEKMACRD 
JwMyUKJjK.Livj.rLr 1 1 VKmaAir 1 W x ukj lAf yNJjJ. KQQERINSQ 
REEIERQRKMLAKRKPPAMGQAPPATNEQKQRKSKTWGAENETL 
TLAEYHEQEE I FKLRLGHLKKEEAE IQAELERLERVRNLHIREb 
KR1HNEDNSQFKDHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQLNKNWRDEKKENYHKHACREYRIHKELDHPRIVKL 
YDY FSLDTDS FCTVLE YCEGNDLDFYLKQH KLMS E KE ARS 1 IMQ 
I VNALKYLNEI KPP I IH YDLKPGNI LLVNGTACGE I Kl TDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPKISNK 
VDVWSVGVIFYQCLYGRKPFGHMQSQQDILQENTILKATEVQFP 
PKPVVTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPH I RKS V 


5996 


1612 


981 


DQQACLIjGLMLTLE FG I LEFDPS W I GSWTQR/ S W VS WRSRPGCE 
LFS I WFG S I VNEG YLNS ASEGE E FC I YNRNPNACS YG VAVGVIi 
AFLTCLLYLALDVYFPQISSVKDRKK\AVI^GHPWSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLNEGTDAS PGRPSPFS 
FFSI FTWSLTAALaVRRFKDLSFQEEYSTLFP\ASAQP 


5997 


1612 


981 


DQQACLLGLMLTLEFG ILE FDPSVJ IGS WTQR/ S WVSWRSRPGCE 
LFS I VVFGS I VNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVI, 
AFLTCLLYLAI,DVYFPQISSVKDRKK\AVriSGHPVVSGEPHPAA 
FWAFLWFTGDS C YL\ AN.QWQVS KPKDN PLNEGTDAS PGRPS PFS 
FFS I FTWSLTAAIAVRRFKDLS FQEEYSTLFP\ASAQP 


5998 


1612 


981 


DQQ ACLLGLMLTLEFG I LE FDPS W I GS WTQR / S W VS WRS R PGCE 
LFS I WFGS 1 VNEGYLNS AS EGEEFC I YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQ I S S VKDRKK\ AVLSGHP WSGEPHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDAS PGRPSPFS 
FFS I FTWSLTAAIAVRRFKDLSFQEE YSTLFP\ASAQP 


5999 


2 


1790 


RPPMEKARRGGDGVPRGPVLHIWVGFHHKKGCQVEFSYPPIiIP 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHIiPPRNGNG 
ATVFG I SC YR \ QIEAKALKVRQAD I TRETVQKS VC VLS KL PL YG 
LLQAKLQL I THAY FEEKDFSQIS I LKEL YEHMNS S LGGAS LEGS 
QVYLGLSPRDLVLHFRH KGLI LFKLI LLEKKVLFYI S PVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTNLGTIRKVMAGNHGEDAAMKTEEPLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETIiDPSVLEDPNLKERE 
QLGSDQTNLFPKDSVPSESLPITVQPQANTGQWLIPGLISGLE 
EDQYGMPLAI FTKG YLCLP YMALQQHHLLSDVTVRG FVAGATNI 
LFRQQKHLSDAI VEVEEALIQIHDPELRKLLNPTTADLRFADYL 
VRHVTENRUD VFLDGTG WEGGDEW IRAQFAVY I HALLAATLQLV 
LFRI VNVAKKI GNVMVTT\ SRNWQTGK\AVGQS VGGAFS \ S AK 
TA\MSSWLSTFTTSTSGSLTEPPDEKP 


6000 


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
UU rbUKVriK 1 1. r QNREF KATM ^NLI^YIjKHLKGQNEAALECLR 
KAEELIQQEHADOAHIRSLVTWGNYAWVYYHMGRLSDVQIYVDK 
VKHVCEKFSSPYRIESPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQ YLKVLLALKLHKMREEGEEEGEGEK\ LVEEALEKAPG \ VTD V 
LRSAA\KFYRGKDEPDKAIELLKKALEYIP\NNAYLHCQIGCCY 

rakvfqvmnlrengmygkrkllelighavahlkkajdeandnlfr 
vcs1 laslhaladqyedaeyyfqkefs keltpvakqllhlrygn 

FQLYQMKCEDKAIHHFIEGVKINQKSREKEKMKDKLQKIAKMRI. 
SKNGADSEALHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 
SWNGB 


6001 


176 


1038 


AFAHSPSRGHRKTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 
WLHFDADGSGYLEGKELQNL I QELQQARKKAGLELS PEMKTFVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
WRKYDTDHSGFI ETEELKNFLKDLLEKANKTVDDTKLAEYTDLM 
LKLFDSNWDGKLELTEMARLLPVQENFLLKFQGIKMCGKEFNKA 
FEL YDQDGNG YI DENELDALliKDLCE KNKQDLD INN I TT YKKNI 
MALSDGGKLYRTDLALILCAGDN 


6002
6003 


977 


81 


LAPPGGGLHI PPRTPLSHSRPPPSHHAPHPS PLPLP PADLHPHS 
SMAQRSDLLEU3CQLTRDRVVVVSHDENLCRQSGLNRDVGSLDF 
EDLPLYKEKLEVYFSPGHFAHGSDRRMVRLEDLFQRFPRTPMSV 
E I KGKNEEL IREQ/ VLVR R YDRNE I TI WAS E KSS VMKKCKAANP 
EMPLS FT I SRG FWVLLS Y YLGLLP FI P I PE KFFFCFLPN I INRT 
YFPFS CS CLNQLLA WS KWL I MRKSLI RHL EERGVQ W FWCLNE 
ES DFEAAFS VGATG V I TP YPTALRH YLDNHGPAARTS 


6004 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
AP KTSGNP AN S ARKPG SAGG P KVGAGAS KEGG AGAVDEDDFI KA 

FTDVPSIQIYSSRELEETI^KIREILSDDKHDmX3RANALKKIR 
SLLVAGAAQYDCFFQHLRLI*DGALKLSAKDLRSQWREACITVA 
KLSTVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRPi I 
RHTH VPRLI PL I TSN CTS KS VP VRRRS FE FLDLLLQEWQTHSLE 
RHAAVLVETI KKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGR VS AG SS KAS SI»PGS1»QRS R S DIDVNAAAGAK 
AHHAAGQS VRSGRLGAGALNAG S YASLEDTSDKLDGTAS EDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALS TVSS G VQRVL VNS AS AQKRS KI PRSQGCS REAS PSRLS V 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVL,NTGSDVEEA 
VADALLLGD IRTKKKPARRR YES YGMHSDDDANSDASS ACS ERS 
YSSRNGSIPTYMRQT\EDV\AEVLNRCASSNWSERKEGLI,GLQN 
IiKNQRTLSRVELKRLCE IFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKK^X5ADLLGSVGJuWQKALDVTRES 
FPNDLQFNILNRFTVDQTQTPSLKVKVAILKYI ETLAKQMDPGD 
FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALPKTFQDGATKIiLHNHLRNTGNGTQSSMGS P LTR PTP 
RSPANWS S PLTS PTNTSQNTLS PSAFD YDTENMNS EDI YS S L^G 

vteaiqnfsfrsqedmneplkrdskkddgdsmcggpgXmsdpra 

GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKEI^NHNER 
VEERKIALYELMKLTQEES FS VWDEHFKTILLLLLETLGDKE^T 
IRAIJVI,K\^EILRHQPARFKNYAELTVMKTLEAHKDPHKEVVR 
SAEEAASV\I^TSI\SPEQCIKVLCPIIQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 

HAVIGDELKPHLSQLTGSKMKLLNIiYI KRAQTGSGGADPTTDVS 
GQS 




140 


4098 



GKLRAFRGMRRLI CKRICD YKS FDDEESVDGNRPSSAASAFKVP 
APKTSGNP ANSARKPGSAGGPKVGAGAS KEGGAGAVDEDDFI KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQ YDCFFQHLRLLDGALKLSAKDLRSQWREACI TVA 
HLS TVLGNKFDHGAEA I VP TL FNLVPNS AKVMATS G CAA I R F 1 1 
RHTHVPRLI PLITSNCTSKSVPVRRRS FEFTjDLLLQEWQTHSLE 
RHAAVLVETIKKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 
NSLEPS YQKS LQT YLKS SGS VAS LPQS DRS SS SSQES LNRP FS S 
KWSTANPS TVAGRVSAGSSKAS S LPGS LQRSKSOI DVNAAAGAK 
AHHAAGQS VRSGR LGAGALNAGS YASLEDTSDKLDGTASEDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVS SG VQR VLVNSASAQKRS KI PRSQGCSREAS PSRLSV 
ARSSRIPRPSVSC^CSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVLNTGSDVEBA 
VADALLLGD IRTKKKPARRR YES YGMHSDDDANSDASSACSERS 
5fSSRNGSIPTYMRQT\EDV\AEVLNRCASSNl7SERKEGLLGLQN 
[iLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
3VHKDDLQDWLFVI^TQLLKKMGADLIX5SVQAKVQKALDVTRES 
?PNDLQFNI LMR FTVDQTQTPS LKVKVAI LK Y I ETLAKQMDPGD 
7 INS SETRLAVS RVI TWTTEPKSS D VRKAAQS VLISLFELNTPE 
FTMLLGALP KTFODGATKX,IjHNHLRNTGNGTOSSMGS PT .TP PTP 
RSPANWSS PLTS PTNTSQNTLS PSAFDYDTENMNSED I YSSLRG 
VTEAIQNFSFRSQEDMNEPIiKRI>SKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKS ALKEAMFDDDADQFPD DLS LDHS DLVAEIjLKELSiSIHKER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
IRALALKVLRE ILRHQ P AR F KN YAELT VMKTLEAHKDPHKEWR 
SAEEAAS V\ LATS I \S PEQCI KVLCPI I QTADYP I NLAAI KMQT 
KV I ERVS KETL>JLLL PE I MPG L. I QG YDNSES S VRKACV FCLVAV 
HAV I GDE LKPHLSQLTGS KMKLLNLYI KRAQTGSGGAD PTTDVS 
GQS 


6005 


133 


5955 


RS SGRRQEQLGQFPGRERKGMASGLGS PS PCSAGSEEEDMDALL 
NNSLPPPHPENEEDPEEDIjSETETPKLKKKKKPKKPRDPKIPKS 
KRQKKERMLLCRQLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 
GKKKKKKLGPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLI* 
BDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPIilAAKNPKIAVS 
KMMMVLGA KWR 3 FSTNNP FKG SSG AS VAAAAAAA VAWE S MVTA 
TEVAPPPPPVEVPIRKAKTKEGKGPNARRKPKGSPRVPDAKKPK 
PKKVAPLKIKLGGFGSKRKRSSSEDDDLDVESDFDBASINSYSV 
SDG S TS RSS RS R K KLRTTKKKKKGEEEVTAVDG YETDHQD YCEV 
CQQGGEIILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 
QWE AKE DNS EGEE I LE E VGGDLEEEDDHHME FCRVCKDGGELLC 
CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 
WKWGQP PS PT PVPRP PDADPNTPS PKPLEGR PERQFF VKWOGMS 
YWHCSW VSELQLELHC \QVMFRNYQRKNDMDEP PSGDFGGDEEK 
S\RKRKNKDPKFAEMEERFYRYG I KPEW\MMIHRIIiNHSVDKKG 
HVHYLIKWRDLPYDQASWESEDVEIQDYDLFKQSYWNHRELMRG 
EEGRPGKXLKKVKLRKLERPPETPTVDPTVKYERQPEYLDATGG 
TLH P YQMEGLNWLRFS WAQGTDTI IiADEMGLG KTVQTAVFIiY S L 
YKEGHSKGPFLVSAPLSTIIN\WEREFEMWAPDMYV\VTYVGDK 
DSRAI IREKEFS \ FEDNAIRGGKKASRMKKEASVKFHVLLTS YE 
LITIDMAILGS IDWACLI VDEAHRLKNNQS KFFR VLNGYSLQHK 
LLLTGTPIiQNNLEEt*FHI»LNFX»TPERFHNLEGFIjEEFADIAKED 
QIKKLHDMLG\PHMLRRLKADVFKNMPSKTELIV\RVELSPM\Q 
KKYYK\ YILHS KFLKAI*N\ARGGGNQVSLLNVVMDLKKCCNHPY 
LFPVAAMEAPKMPNGMYDGSAilRASGKLIiLLQKMLKNLKEGGH 
RVLIFSQMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAQQFCFLLS TRAGGLG INLATADTVI I YDSDWN PHND I Q 
AFSRAHRIGQNKKVMIYRFVTRASVEERITQVAKJCKMMIjTHLVV 
RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 
SSV I HYDDKAI ER LLDRNQDETEDTBLQGMNE YLS S FKVAQ YW 
REEEMGEEEEVERE I IKQEESVDPDYWEKLLRHHYEQQQEDIiAR 
NLGKGKR I RKQ VNYNDGSQEDRDWQDDQSDNQSD YS VAS EEGDE 
DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 
QRKAFLNAIMR YGMP PQDAFTTQWLVRDLRGKSE KEFKAYVSL F 

EFEHVNGRWSMPEIAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 
NTPAPVP PAEDGI KIEENSLKEEES IEGEKEVKSTAPETAI ECT 
QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 
AADVEKVEE KSAI DLTPI VVEDKEEKKEEEEKKE VMLQNGETP K 
DLmDEKQKKNIKQRFMFNIAI)GGFTE3^SLWQNEEPJ^ATVTKKT 
YEI WHRRHD YWLLAG I INHGYARWQD I QNDPRYAI 1»N3 PFKGEM 
NRGNFLEIKNKFLARRFKLLEQALVIEEQIaRRAAYLNMSEDPSH 
PSMALNTRFAEVECLAESHOHLSKESMAGNKPANAVLHKVLKQL 
EELLSDMKADVTRLPATI ARI PPVAVRLQMSERN I LSRLANRAP 
EPTPQQVAQQQ 


6006 


1 


965 


DNDFLRNTVHRHE P P VTAEP I RLLAENE DWWDKPSS 1PVHPC 
GRFRHNTVIFIIX5KEHQIjKELHPI*HRLDRLTSGVIiMFAKTAAVS 
ER I HEQ VRDRQLE KE Y VCR VEGE FPTEE VTCKEP I LWSYKVGV 
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGHPILKDPI YNSVAWGPSRGRGG YIPKTNEELLRDLVAEHQAX 
QSLDVLDIjCEGDLSPGIjTDSTAPSSBLGKDDLEELAAAA\QKME 
E VAE AAPQEI*DTI ALAS E KA VETDVMNQ \RQT\ TLCRV PAG ATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


HELGQ VE Y V FTDKTGTLTENEMQ FRECS I NGM KYQE I NG RLVPE 

gptpdssegnlsylsslshlnni»shlttsssfrtspenetelik 
ehdlffkavslchtvqinnvqtdctgdgpwqsnlapsqleyyas 
spdekalveaaarigi vfignseetmevktlgklery kllhi le 
fdsdrrrmsvivqapsgekllfakgaessilpkciggeiektri 

HVDEFALKGLRTLCIAYRKFT£KEYEEIDKRrFEARTALQQR\E 
EKLAAVFQFI EKDLI LLGATAVEDRLQD KVRETI EALRMAG I KV 
W VLTGD KHETAVS VS LS CGHFHRTMN I LELI NQ KSDSECAEQLR 
QLARR I TEDHVT QHG LWDGTS LSLALREHEKLFME VCRNCSAV 
LCCRMAPLQKAKVIRL I KIS PE KP I TLAVG DGANDVSM I QEAHV 
GIGIMGKEGRQAARNSDYAIARFKFLSKLLFVHGHFYYIRIATL 
VQY FF YKNVCFI TPQFIiYQFYCLFSQQTLYDSVYI*TIjY \NI CFT 
SLP I LI YSLLEQHVDPHVLQNKPTLYRDISKNRLLS IKTFL YWT 
ILGFSHAFIFFFGSYIiLIGKDTSLLGKTGQMFGNWTFGTLVFTVM 
VITVTV KMALETHFWTWTNHLVTWGS 1 1 FYFVFSLF YGG I LWPF 
LGSQNMYFVFIQLLSSGSAWFAI I LMWTCL PL D 1 1 KKVFD RHL 
HPTSTEKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRCSPTHISRSWSASDPFYTODRSILTLSTMDSSTC 


6008 


4554 


1089 


AGVRRAGARRGPGRALPAGATAVPPPSARRRRRCPAPEHAGPAR 
ASRPSQETMFQLP VNNLGSLRKARKTVKKIIjSD IGLEYCKEHI E 
DFKQFEPNDFYIiKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 
FSSKFFSAYKSHFRNVHSEDFENR I LLNCPYCTFNADKKTLETH 
I KI FHAPNASAPSSSLSTFKDKNKNDGLKPKQADS VEQAVY YCK 
KCTYRD PLYE I VRKHI YREH FQHVAAP YI AKAGE KS LNGAVPLG 
SMAREES S I HCKRCL FMP KS YEALVQHVT EDH ER IG YQ VTAM I G 
HTNVWPRS KPLML I APKPQDKKSMGLPPR IGS LASGNV\RS LP 
SQQMVNRLSI PKPNLNSTGVNMMSS VHLQQNNYGVKS VGCGYSV 
GQSMRI*GLGGNAPVSIPQQSQSVKQLLPSGNGRSYGLGSEQRSQ 
APARYS LQSANASSIiSSGQLKS PSLSQSQASRVLGQSSSKPAAA 
ATG PP PGNTSSTQ KWKICTI CNELFPBNVYSVHFEKEHKAEKVP 
AVANYIMKIHNFTSKCLYQJRYLPTDTLIJ«^IHGLSCPYCRS 
TFNDVEKMAAHMRMVH I DEEMG PKTDSTTiS FDLTLQQGSHTNI H 
LLVTT YKLRDAPAES VAYHAQNNPPVP P KPQPKVQEKAD I PVKS 
SPQAAVPYKKDVGKTIiCPIiCFS IIiKGP I SDALAHHLRERHQVIQ 
TVHP VEKIOjT YKCI HCXGVYTSNMTAST ITLHLVHCRG VGKTQN 
GQDKTWAPSRLNQSPSIiAPVKRTYEQMEFPLLKKRKLDDDSDSP 
S FF EEKPEEP WIiALDPKGH\ EDDS YEARKS FI/TKYFT\ KQP YP 
TRREIEK1lAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
I^FNMKELNKVKHEMDFDAEGLFENHDEKDSRVNASKTADKKLN 
LGKEDDSSSDSFENLEE3SNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKI/MHNASDS 
EVDQDDWEWKDGAS PS ESGPGSQQVSDFEDNTCEMKPGTWSDE 
SSOSEDARSSKPAAKKKATMQGDREQLKWKNSSYGKVEGFWSKD 
QSQWKNASENDERLSNPQIEWQNSTIDSEDGEQFDNMTDGVAEP 
MHGSLAGVKLSSOQA 


6009 


4272 


1534 


CHGLQHLTPFRELNI,SLQG*EPH*AA*QAVRSEEKSIC*GSPSC
HLVLGVLVP VARQSSHSAG PAQS AFR * TGTGSGTPKAAEQS G YW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQAS QRRTVFTAGGGECLGAKS VRAS VFTGNQ PGVMG LL 
NGKRGGCFE SG YL FG F I VIGKIQS LEAKVPLPVNGQTGERAS PG 
NCRIHI VDAVC* SEHH*DHFLAAAFLENSTI IS * VAPGSWQDHA 
VLQ KE VQAS VRCRGFES VDTAPAGFWAHS P PGLQG EPTTTS VSL 
FVLAPQDGEGVPFVEGQLVTVLGLWPQS IRHTFVHHTQLFLHP 
I * KLGALDVAFLHLLTLVCSSFNVAYG*GKNGGTTLHQL,FABVN 
AVTRGSAVQRRPS IT I S S I HVDTKI QQELHDVMVAGAtXSVVQWG 
DPFWGLAGIFHI,IDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 
RPLRVGbLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHVVVIVIiC3lLGSLVGGLGTDBLLWFGGR*LIIIG 
I * * RGRLSGEWGCGLGRGELFQVSIGIGVS I VHIGQGDHEVLGG 
AGLVERGALHATGQG VE ALVQQ LLDVGPAGALGLCDGAALFQG P 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CG VGGAILLKALSQ Y FLKGG * RLVJ CARGQ* PVKKRQRRWRG * TR 
R *NGLTIHCFN* LI *GAVCCRLVI LRWCGLLEVHGVYGT* IHCL 
GS FPGRLWP + PFI SQERPNGHCQWE FRLAVPSWKCRWSRWRVRG 
TWRYGNPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LP P FQGACRPRTQRCRT WVCPI AWRQLLAYTRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AG I S QNAKTGDLPAFGEC VGI AS KALCGLTEAAAQAAYLVGI FD 
PNSQ AGHQG LVD PIQ FARANQA I QMACQNLVD PGS S PS QVLS AA 
TI VAKHTSALCNACRIAS S KTANPVAKRH FVQSAKEVANSTANL 
VICT I KALDGDFS EDNRNKCRI ATAPL I EAVENLTAFASNPE FVS 
I PAQ I S S EGSQAQE P I LVS AKPMLE S S S YLIRTARS LA INP KDP 
PTWS VLAGHSHTVSDS I KSLITS I RDKAPGQRECDYSIDGINRC 
IRDIBQASLAAVSQSLATRDDISVEALQEQLTSVVQEIGHLIDP 
I ATAARGE AAQLGHKGTQ LAS YFE PL I LAAVGVAS KI LDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDD I M VTI »NE AASEVGLVGGMVDA I AEAMS KLDEGT PPE P KG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPBELGGLASQMTSD 
YGHLAFOGQMAAATAEPEE I G FQI RTR VQDLGHGC I FLVQKAG\ 
ALQVCPTDS YTKRE LI ECARAVTEKVS LVLSALQAGNKGTQACI 
TAATAVSGI IADLDTTI MFATAGTLNAENS ETFADHREN I LKTA 
KALVEDTKLLVSGAAST PDKLAQAAQSSAATI TQLAEWKLGAA 
S LGS DI>PETQ WLI NAI KDVAKALSDL I S ATKGAASKPVDD P SM 
YQLKGAAKVMVTNVTS LLKT VKAVEDEAU'RGTRALEAT I EC I KQ 
ELTVFQS KDVPEKTSS PEES I RMTKG I TMAT AKA V AAGNS CRQE 
DVI ATANLSR KAVSDMLTACKQAS FHPDVSDE VRTRALRFGTEC 
TLGYLDLLEHVLVI^QKPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPBDPTVIAETELLGAAAS I EAAAKKLEQLKPRAK 
P KQADETLDFEEQ I LEAAKS I AAATS ALVKSAS AAQRELVAQG K 
VGS I PANAADDGQWS QGL I SAARMVAAATSSLCE AANASVQGHA 
S EEKLI S S AKQVAAS TAQLLVACKVKADQDS EAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDWVKTKFVGGI AQI I AAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDBG 


6011 


446 


1835 


LLQPAMRKSPGLSDCLWAWILLLSTLTGRSYGQPSLQDELKDNT 
TVFTRI LDRLLDGYDNRLRPGLGERVTEVKTDI FVTS FGPVSDH 
DMEYTIDVFFRQSWKDERLKFKGPMTVLRIJ^NLMASKIWTPDTF 
FHNGKKSVAHNMTMPNKLLRITSIX3TLLYTMRLTW\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSVWAED 
GSRIjNQYDLU^TVDSGIVQSSTGEYVVMTTHFHLKRKIGYFVI 
QTYLPC I MTV I LSQVS FWLNRESVPARTVFGVTTVLTMTTLS I S 
ARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYA 
WDGKSVVPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLA 
TIAKSATI EPKEVKPETKPPEPKKTFNSVS KIDRLSRIAFPLLF 
GIFNLVYWATYLNREPQLKAPTPHQ 


6012 


351 


5013 


PAELFQS FAI WHKELYDWRLGP WNQCOPVI SKSLEKPLECI KGE 
EGIQVREIACIQKDKD I PAED 1 1 CEYFEPKPLLEQACLI PCQQD 
CIVS EFS AWS ECS KTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE 
REKDRSKGVKDPEAREL I KKKRNRNRQNRQENRYWD I Q IG YOTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCS KTCHDMVS PAGTRVRTRTI RQ FP IGS E KECPE FEE KE PCLS 
QGDGWP CATYGWRTTEWTECRVT) PLLSQQDKRRGNQTALCGGG 
I QTRE VYCVQANENLLS QLSTHKNKE ASKPMDLKLCTGP I PNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCNDQQGKKGFKLRKRRIT 
NEPTGGSGVTGNCPHLLEAIPCEEPACYDWKAVRLGDCEPDNGK 
ECGPGTQVQEWCiNSDGEEVDRQLCRDAI FP I PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQI RARS I LAYAGEEGGIRCP 
NSSALQEVRS CNEHPCTVYHWQTGPWGQCI EDTS VSS FNTTTTW 
NGEASCSVGMQTRKVI CVRVNVGQVGPKKCPESLRPETVRPCLL 
PCKJCDCIVTPYSDWTSCPS\SCKEGDSSIRKQSRHRVIIQLPAN 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\1*VP\WS 
VQQDS P\GAQEGCG PGRQARA I TCRKQDGG0AGI HECLQ YAGP V 
PALTQACQ I PCQDDCQLTS WS KFSS CNGDCGAVRTRKRTIiVGKS 
KKKE KCKNSHLYPL I ETQ YCPCDKYNAQ P VGNWSDC I LPEGKVE 
VLLGMKVQG D I KECGQG YR YQAMACYDQNGRLVETSRCNS HG Y I 
EEACI I PCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 
GGR PC PKLDHVNQAQ VY E VVPCH SDCNQ YLWVTE PWS I CKVTF V 
NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDCVISEWGPWTQCVLPC^QSSFRQRSADPIRQPADE 
\zu.s>K, fWM v ciu^f L.W JjN issi c. xn i u iNVIDWSTCQL»SEKAVCGNGI 
KTRMLlXlVRSDGKSVDLKYCEALGbEKNWQMNTSCMVECPVNCQ 
L S D W S PWSECSQTCGLTGKM IRRRT VTQ P FQGDGRPCP S LMDQS 
KPCPVKPCYRWQYGQWSPCQVQEAQCGEGTRTRNISCWSDGSA 

isuc oi\v vi/aiiT > — f-u_i_L JL IJVsiN iST»l*l v 1 ir.r.St N.t^lMC H^jL^vJx ijKJJW 

SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEQMIj 
ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 
SQP DADRS CN P PCSQPHS YCSETKTCHCEEG YTE VMS SNS TLEQ 
CTI* I P VWL P TMEDKRGDVKTSRAVH PTQP S SNPAGRGRTW FLQ 
PFGPDGRLKTWVYGVAAGAFVLLI FIVSMI YLACKKPKKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC
S I VDVEFLP VYHPSPE ESRDPTLYANNVQRVMAQALGI PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
ARPESKDQPGRVCQAATAIi 


6014 


2857 


613 


EAVAGGMEKS RMNLP KG PDTLCFDKDEFMKE0FD VDHFVSDCKK 
RVQLEELRDDLELYYKIiLKTAMVELINKDYADFXVNLSTWLVGM 
DKALNQLSVPLGQLREBVLSLRSSVSEGIRAVDERMSKQEDIRK 
KKM CVLRLI QVIRS VEKI EKILNSQSSKETSALEASS PLLTGQI 
LER I ATEFNQLQFHACQS K\GMPLLDKVRPRI AG ITAMLQQSDE 
GLLLEG LQTS DVDI I RHCLRTYATI DKTRDAEALVGQ VL VKP Y I 
DEVI I EQFVESHPNGLQVMYNKIiLEFVPHHCRLLREVTGGAI SS 
dAon i v fbiur ijV^^vvVFy I VQGl>fcJEKL»PSLFNPGNPDAFHEKY 
TISMDFVRRLERQCGSQASVKRIxRAHPAYHSFNKKWNLPVYFQI 
R FR E I AGSLEAALTD VI*EDAPAES P YCLIAS HRTWSSLRRCWSD 
EMFLPLLVHRLWRLHSGR FWARYSVFV\N\ELSLRPISNES PRE 
IKKPLVTGSKEPS rTQGNTEDQGSGPSETKP WS ISRTQliVYW 
ADLDKLQEQLPELLEI I KPKLEMIGFKNFSS ISAAIiEDSQSS FS 
ACVPSLSSKIIQDLSDSCFGFIiKSALEVPRLYRRTNKBVPTTAS 
SYVDSALKPLFQLQSGKKDKLKQAI I QQWLEGTLSESTHKYYET 
VSDVLNS VKKMEES LKRLKQARKTTPANPVGPSGGMSDDDK I RI* 
QLAL D VE YIX3 EQ I Q KLG LQAS D I KS FS ALAEL VAAAKDQATAEQ 
P 


6015 



13 


2237 


AEGCAERRGT EPWELS M S W ESGAGPGLGSQGMDL.VWS AW YGKC 
VKGKG S I*PLS AHG I VVAWLSRAEWDQVTVYLFCJDDHKLQR YAIiN 
RITVWRS RSGNEL P LAVAS TADLI RCKIiIjDVTGGLGTDELRLIiY 
GMALVRFVNLI S ER KTKFAKVPLKCLAQE VNI PDWI VDLRHEIiT 
HKKMPHINDCRRGCYFVLDWLQKTYP7CRQLENSLRETWELEEFR 
EGI EEEDOEEDKN I WDD I TEQKPE PQDDGKSTESDVKADGDS K 

KAI KAWNNPS P RVEC VLAELKG VTCENREAVLDAFLDDGFLVPT 
FEQLAALQ I E YEENVDLND VLVPKPFSOFWQPLLRGLHSQNFTQ 
ALLERMLSELPAW3ISGIRPTY1J^WTVELIVANTKTGRNARXF 
SAGQWEARRGWRLFNCSAS LDWPRMVESCLGS PCWASPQLLR 1 1 
f\kamgqglqde\ EQEKLLRI CS I YTQSGENSLVQEGSEAS pi g 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 

qesptaenarliaqxrgal^sawqvssedvrwdtfp\lgrmpr 

SRPRTPAE LMLENYDTHV I FWTKPVL \ EQRLE PS TCK\TDTLGIj 
\SCGVGS\GNCSNSSSSi^RGAFLLEARGSLH\GL\KTGLQliF 


6016


13 


2237 


ASGCAERRGTEPWELSMS WESGAGPGLGSQGMDLVWSAW YGKC
V KGKGS Ii PbSAHG I VVAWLSRAE WDQVTVYLFCDDHKLQR YAIjN 

R ITVWRS RSGNELPLAVASTADLI RCKLLDVTGGLGTDELRLI,Y 

GMALVRFVNLI SERKTKFAKVPLKCIiAQEVNI PDW I VDLRHELT 

HKKMPKINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 

EGI EEEDQEEDKN I WDDITEQKPEPQDDGK3TESDVKADGDSK 

peppimcur'irifaT ctjvt?t VT?T>3iT7Trr .TAr*5VPB , PriPTVT.T> , VPDVT.P 
Lroijt. VUotl^ivtS>AljoMivIjlj I iirCtt_K..£l.l-Ji-> v i> i .C.iLc.yr l v jjClvr K. X XJf 

KAI KAKNNP S PRV E C VLAEL KGVTCENRE AVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLS ELPALG I SGI RPTY I LRWTVELI VANTKTGRNARRF 
S AGQWEARRGWRLFNCS ASLDV7PRMVESCLGS PCWASPQLLRI I 
F\KAMGQGLQDE\EQEKL.LRICSIYTQSGENSLVQEGSEASPIG 
KS PYTLDSLYWSVKPASS SFGSEAKACQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEBNDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QES PTAENARIiLAQ KRG ALQGS AWQVSSEDVRWDT F P \ LGRM PR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS \GNCSNS SSSNFRGAFLLEARGSLH\GL\KTGLQLF 


6017 


203 


3469 


SHQE IEQNSAMAPRKRGGRGIS FI FCCFRNWDHPE ITYRLRNDS 
NFALQTMEPALPMP PVE ELDVM FS ELVDELDLTDKHREAM FALP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLNSMAARKSLLAL 
EKEEEEERS KTI ESLKTALRTKPMRFVTRF I DLDGLSC I LNFLK 
TMDYETSESRIHTSIjIGCIKALMNNSQGRAHVLAHSESINVIAQ 
SLSTENI KTKVAVLEI JLGAVCLVPGGHKKVLQAMLHYQKYASER 

trfqti>indld:<stgryrdevslktaimsfinavlsqgagvesl 
dfrlhlrye\fl.mlgihpvmdklrkhenstldrhijdffemlrne 
delefakrfelvhidtksatqkfeltrkrlthseayphfmsilh 
hclqmpykrsgntvqywllldr 1 iqqi viqndkgqdpdstplen 
fni knvvrmlvnenevkqwkeqaekmrkehnelqqklekkerec 
daktqekeemmqtlnkmkeklekettehkqvkqqvadltaqlhe 
lsrravcas i pggpspgapggpfpss vpgsllpp p pppplpggm 
lpp pppplp pggpppppgppplgaimp ppgapmglalkkks i pq 

PTN ALKS FN WS KLPENKI>EGTVWTE IDDTKVFKI I»DLEDLERTF 
SAYQRQQDFFVKSNSKQKEADAIDDTLSSKLKVKELSVIDGRRA 
QKCNI LLSRLKLSNDE I KRAI l/TMDE QEDLP KDMI>EQLLKFVPE 
KSDIDLLEEHKHELDRMAKADRFLFEMSRINHYQQRIjQSI.YFKK 
KFAERVAEVKP KVEAIRSGS EEVFRS GAL.KQLLEWLAFGN YMN 
KGQRGNAYGFKISSLiNKIADTKSSIDKNITLLHYLITIVENKYP 
SVLNLNEELRD IPQAAKVNMTEliDKE ISTLRSGLKAVETELE YQ 
KSQP PQPGDKFVS WSQFITVASFS FSDVEDLLAEAKDLFTXAV 
KHFGEEAGKIQPDEFFG I FDQFLQAVSEAKQENENMRKKKEEEE 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKRITNQMTDSSRERPITKLNF 


6018 


13 


2510 


TISQSGGI RRRREAVWFE WNMDFSRLHMYS PPQCVPENTGYTY 
ALSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRIiATTACTLGD 
GEAVGADSGTSSAVSIiKNRAARTTKQRRSTNKSAFSINHVSRQV 
TSSG VS YGGTVSLQDAVTRRP P VLDES W I REQTTVDHFWGLDDD 
GDL.KGGN KAA I QGNGD VGAG AATGHNG F FCSNCNML S 3RKD V1.T 
AHPAAPGPVSRVYSRDRNQ KCDDCKGKRHLDAHPGRAGTLWH I W 
ACAGYFLLQILRRIGAVGQAVSRTAWSAIjWLAWAPGKAASGVF 
WWLGIGW YQFVTLI S WLNVFLLTRCLRNI CKFLVLLI PLFLLLG 

PLQGD SEAFP WHWMSGVEQQVAS LSGQ CHHHGENLR ELTTLLQK 
LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 
HLEDILGKXrREKSEA I QKELEQTKQKTI SAVGEQLIjPTVEHI*QL 
ELDQLKSELSSWRHVXTGCETVDAVQERVDVQVREMVXLLFSED 
QQGGSLEQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 
TKQLPTSEAWSAVSEAGASGITEAQARAIVNSALKIjYSQDKTG 
MVDFALESGGGSILSTRCSETYETKTALMSIiFGIPIiWYFSQSPR 
WIQPDI YPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHI PKTL 
S PTGNI SSAP KDFAV YGl.iENE YQEEGQLI*GQFTYDQDGE S LQMF 
OALKRPDDTAFQ I VELRI FSNWGHPE YTCLYRFRVKGEPVK 


6019 


2 


1066 


TPNDREPPPQRPPSSRRASHIiAQBITSAASLGnQTQILGSLrXA 



WO 01/53312 



PCT/US00/34263 



PVITSAIRSMPGISSQILTNAQGQVIGTLPWVVNSASVAAPAPA 
QSLQVQAVTPQLLLNAQGQVI ATLASS PLPPPVAVRK\ PSTPES 
LLKSEVQPIKPTPTVPQPAWIASPAPAAKPSASAPIPITCSET 
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFKIRRLSLGLTQT 
QVGQALTATEGPAYSQSAICRFEKLDITPKSAQKLKPVLEKWLN 
EAELRNQEGQQNLMEFVGGE PS KKRKRRTS FTPQAI EALNAYFE 
KNPLPTGQE I TE I AKELNYDRE WRVWFCNRRQTLKNTS KLNVF 
1 Q IP 


6020 


4953 


549


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPVVTLTS YW ED I SHRLDAVNTLLAMAERLQTNIEAbKSG I 
QG K I PAN QLAEL WLKL I DEV I EDTR YT L P l.TEGKANVTVLDTQ I 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSKPDI I IWMIRGEKRLAYARI PAHQVLYSTSGENASGKYC 
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
GT FT VFAEM YENQ ALMFGKWGTS G LVGRHK FSDVTG K I KLKREF 
FLP ? KG WE WEGEW 1 VDP ERSLLTE ADAGH TE FTDEVYQNESR YP 
GGDWKPAEDTYTDANGDKAAS PS ELTC PPG WEWEDDAWS YDINR 

avdekgweygitippdhkpkswvaaekmyhthrrrrlvrkrkkd 

LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPS ETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTP I VSCKFDRDYI YHLRCYVYQARNLLALDKDS FSDP 
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ 
NPP KVIMELFDNDQVGKDEFLGRS I FSP WKLNSEMD ITPKLLW 

| hpvmngdkacgdvlvtaelilrgkdgsnlpilppqrapnlymvp 

1 QGIRPWQLTAIEIIAWGLRNMKNFQMASITSPSLVVECGGERV 
ES WX KNIiKKTPNFPSS VLFMKVFLPKEEL YMP PLVI KVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
D I VIEMEDTKPLLASKCLSSMSTALS KMAS PATVHLTBKEEEI V 
DWiTSKFYASSGEHEKCGQYIQKGYSKLKXYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYTPNTIoNPVFnRMYPT^nvT.pnvw'nT n-rownvn 

TFTRDEKVGBTIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERIxALHILRTQGI.VPEHVETRTl^STFQP 
NIS\RY YLRVI IWNTKDVILDEKS ITGEEMSDI YVKGWI PGNEE 
NKQKTDVHYRS LDGEGNFNWRFVFPFDYLPAEQLC I VAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKSPGGNC/RGLDMI PDLKAMNPLKAKTAS L FEQ KSMKGWW 
PCYAEKDGAR VMAG KVEMTLE ILNEKEADERPAG KGRDEPNMNP 
KLDLPNRPETS FIiWFTNPCKTMKFI VWRR FKWVI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPVVTLTS YWEDI SHRLDAVNTLLAMAERLQTNIEALKSG I 
QGKI PANQLAELWLKLIDEVI EDTRYTLPLTEGKANVTVLDTQ I 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDI I IWMIRGEKRLAYARI PAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQEKNNGPKVPVELRVNI WLGLSAVEKKFNSFAE 
GTFTVFAEMYEKQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ
NPPKVIMELFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLVVECGGERV
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQ Y IQ KGYSKLK3 YNCELENVAEJEGLT 
DFSDTFKbYRGKSDENEDPS WGE FKGS FR1 YPLPDDPSVPAP P 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITI>G 
KKVIE\DRDHYI PNTLNPVFGRMYELSCYLPQEKDLKI SVYDYD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\ I PEEYCV SGVNTW 
RDSLR\PTQ\I*LQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEER1ALHILRTQGLVPEHVETRTLHSTFQP 
NI S \ R YYLRVI I WKTKD V I LDEKS I TGE EMSD I YVKGW I PGNE E 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\IiDDYLGFPRTI»TCRHTI 
HFLQKSPGGNC/RGLDM I PDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKVrVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPKV 


6022 


4953 



549 


EAIQFEVSIGNYGNKFDTTCKPliASTTQYSRAVFDGNYYYYLPW 
AKTKP VVTLTS Y WED I SHRLDA VNTLLAMAERLQTN IEALKSG I 
QGKI PANQLAELWLKLIDEVI EDTRYTLPLTEGKANVTVLDTQI 
RKI^RSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPD3 1 IWMIRGEKRLAYARI PAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQEKNNGPKVPVELRVNI WLGLSAVEKKFNS FAE 
GT FTVFAEM YENQALMFGKWGTSGLVGRHK FSDVTGKI KI*KREF 
FLP P KG WEWEGEW I VDPERSLLTEADAGHTEFTDE V YQNES R Y P 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITI PPDHKPKS WVAAEKMYHTHRRRRLVRKRKXD 
LTQTAS STAG AMEELQDQEGWEYAS L IGWKFHWKQRSS DTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDE KS LEKQ KHSA 
TTVFGANTP I VSCNFDRDYI YHLRCYVYQARNLLALDKDS FSDP 
YAH I CFLHRSKTTEI IKSTLNPTWDQTI I FDEVE I YGEPQTVLQ 
NPP KVI MEIjFDNDQVGKDE FLGRS I FS PWKLNS EMD ITP KLL W 
HPVMNGDKACGDVLVTAEI*I LRGKDGS NLPI LP PQRAPN L YM VP 
QGIRPWQLTAI EI LAWGLRNMKNFQMAS ITSPSI*WECGGERV 
ESWIKWLKIO'PNFPSSVIjFMKVFLPKEELYMPPLVIKVIDHRQ 
t-GRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
D I VI EMEDTKPIOASKCLSSMSTALSKMAS PATVHLT3KEEE IV 
DWWSKPYASSGEHEKCGQYI QKG YSKLKI YNCELENVAEFEGI/T 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGliELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYI PNTIiNPVFGRMYELS CYLPQEKDLKI SVYDYD 
TFTRDEKVGETIIDI»ENPF\IjSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\ PTQVLLQNVARFKGFPQP ILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERIAIiHILRTQGLVPEHVETRTLHSTFQP 
N I S \RY YLR VI IWNTKDVILDEKS ITGEEMSDI YVKGWIPGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
S IDQTEFRI PPR\LI IQIW\DNDKFS\l»DDYLGFPRTLTCRHTI 
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDGARVKAGKVEMTLE I LNE KEADERP AGKGRDE PNMNP 
KLDIiPNRPETS FLW FTNPCKTMKFI VWRR FKWVI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916 


S OELGMF VEI*NNLI*NTTPDRAEQGKLTLLCTDAKTDGS Fit VHHFL 
S FYLKANCKVCFVALIQSFSHYS I VGQKLGVSLTMARERGQLVF 
LBGL/rVCSGR\VFQAQKBPHPLQFLREANAGNLKPLFEFVREA 
LKPVDSGEARWTYPVLLVDDLSVIjLSLGMGAVAVIiDFIHYCRAT 
VCWELKG1W5VVLVHDSGDAEDEENDILLNGLSHQSHLILRAEGL 
ATGFCRDVHGQLRI LWRRPSQPAVHRDQS FT YQ YKI QDKS VS FF 
AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQLNELELLMEKSFWEEAE 
LPAELFQ KKWAS FPRTVLSTG^NR YLVIxAVNTVQNKEGNCEK 
RLVITASQSLENKELCIIiRNDWCSVPVEPGDIIHLEGDCTSDTW 
1 1 DKDFGYIj I L» YPDMI* I SGTS IAS S I RCMRRAVLSETFRS SDPA 
TRQMLIGTVLHEVFQKAINNSFAPEKLQELAFQTIQEIRHLKEM 








YRLNLSQDEI KQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSL 

PSDNSKDNSTCNIEWKPMDIEESIWSPRFGLKGKIDVTVGVKI 

HRGYKTKYKIMPLELKTGKESNSIEHRSQWLYTLLSQERRADP 

EAGLLLYIiKTGQM Y PVPANHLDKRELLKLRNQMAFS LFHRIS KS 

ATRQKTQLASLPQI IEEEKTCKYCSQIGNCALYSRAVEQQMDCS 

SVPIVMLPKIEEETQHLKQTHLEYFSLWCLMLTLESQSKDNKKN 

HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFQCKH 

GAIPVTNL^GDRVIVSGEERSLFALSRGYVKEINMTTVTCLLD 

RNLSVLPESTLFRLDQEEKNCDIDTPLGNLSKLMENTFVSKKLR 

uux lucKCfUl- i S Y L»SS VCPHDAKDTVAC I LKGLNKPQRQAMKK 

VLL S KDYTL I VGMPGTG KTTT I CTLVRI L YACGFS VLLTS YTHS 

AVDNII,LKLAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 

KS\LALLEELYTSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ 

I S Q P I CLGPL F FS RRF VL VGDHQQLPPU VLNREARALGHS ESLF 

KRLEQNKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 

NAVXNLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 

KVPAPEQVEKGGVSNVTEAKLIVFLTSIFVKAGCSPSDIGIIAP 

YRQQLKI INDLLARSIGMVEVNTVDKYQD\RDKS I VLVSFVRSN 

KDGTVGELLKDWRRT.NVAITRAKHKLILLGCVPSLNCYPPLEKL 

LNHLNSEKLIIDLPSREHESLCHILGDFQRB 


6025 
6026 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTIiYVSPHPDAFPSLRALIA" 
ARYGEAGEGPG WGGAHPRI CLQPPPTSRTS FPPPRLPALEQGPG 
GLWVWG ATAVAQLLWPAG LGGPGGS RAAVLVQQtWS YADTE I>I P 
AACGATLPALGLRSSAQDPQAVLGALGRALSPLEEWLRLHTYLA 
j GEAPTLADLAAVTAIjIjLPFRYVLDPPARRI WNNVTRWFVTCVRQ 
PEFRAVLGEWLYSGARPLSHQPGPEAPALPKTAAQLKKEAKKR 
EKLEKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 
PGEKKDVSGPMPDS YS PRYVEAAW YPWWEQQGFFKPEYGRPNVS 
AANPRGVFMMCIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGB 
TTLWNPGCDHAGIATQVVVEKKLWREQGLSRHQLGREAFLQEVW 
KWKEE KGDR I YHQLKKLGS S LDWDRACFTMDP KLS AAVTEAF VR 
LHEEG 1 1 YRSTRIiVMWS CT1«NS AI SD I E VDKKELTGRTLLS V^G 
YKEKVE FGVLVS FAYKVQGSDS DEE VWATTR I ETM LGDVAVAV 
HPKDTRYQHLKGKNVIHPFLSRSLPIVFDEFVDMDFGTGAVKIT 
PAHDQNDYEVGQRHGLEAISIMDSRGALINVPPPFLGLPRFEAR 
KAVLVALKERGLFRG I EDNPMWPLCNRSKDWE PLLRPQW YVR 
CGEMAQAASAAVTRGDLRILPERHQRTWHAWMDNIRE\WCMFPG 
KLWWG \ HR \ I PAY FVTVSD PAV P PGED PDGR YWVSGRNEAEARE 
KAAKEFGVS PDKI SLQQDEDVLDTWFSSGIiFPLS ILGWPNQSED 
kpvr * ru 1 -i yjttvjL lit- 1 WVAKnVMlAaLKLTGRLPFREVYLHA 
IVRDAHGRKMSKSLGNVIDPLDVIYGISLQGLHNQLLNSNLDPS 
EVEKAKEGQKADFPAGIPECGTDALRFGLCAYMSQGRDINLDVN 
RILGYRHFCNKLWNATKFALRGLGKGFVPSPTSQPGGHESLVDR 
WIRSRL TBAVRLSNQG FQA YDFPAVTTAQ YS FWL YEI>CD VYLEC 
LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPF^/TEELFQ 
RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALELAIjSITRA 
VRP \ LRAD YNIiHPE SG PTCFLE VAD\ EATGALASAVSGYVQGPG 
QAQVWAVAEPWGLPAP\QGCAVALASDRCSI\HLQLQG\LLDP 
ARELG\KLQ\AKRVEAQ\RQAQ\RLR\ERRA\ASGNPVKVPL\E 
VQEADEAKLQQTEAELRKVDEAIALFQKML 




2674 


514


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC
TCEIRPWFTPRSIYMEASTVDCNDLGLLTFPARLPANTQILLLQ
TWNIAKIEYSTDFPVWLTGLDLSQNNLSSVTNINGKKMPQLLSV

YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHN 
LLRLHLNSNRLOM INS KW FDALPNLE IltM I GEN PI I R I KDMNFK 
PLINLRSLVIAGINLTE I PDNALVGLENLES IS FYDNRLI KVPH 
VALQKVAWLKFLDLNKNPINRIRRGDFSNMLHLKELGINNMPEI* 
I S IDSIAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESIjMIf 
MSNALSALYHGTIESLPNLKEISIHSNPIRCDCVIRWMNMNKTN 
IRFMEPDSLFCVDPPEFOGONVRQVHFRDMMEICLPLIAPESFP 
3NLNVEAGS YVS FHCRATA\EPQPE I YWITPSGQKLLPNT\LTP 








KFYVHS EGTLD INGVTP KEGGLYTCI ATNIiVGADLKSVMI KVDG 
S FPQDNNGSLNI KI RDIQANSVLVS WKASSKILKSSVKWTAF VK 
TENSHAAQSAR I PSDVKVYNLTHLNPSTE YKIC1 D I PTI YQKNR 
KKCVNVTTKGLHPDQKEYEKNNTTTLMACLGGLLGI IGVICLI S 
CLSPE^CDGGHSYVRNYLQKPTFALGELYPPLIWLWEAGKEKS 
TSLKVKATVIGLPTNMS 


6027 


5254 


4148 


GGRRAPGR PGRS I KDEEE ET VFRE WS FS P DPLP VR YYDKDTTK 
PISFYLSSLEEIiLAWKPRLEDGFNVALEPIACRQPPLSSQRPRT 
LLCHDMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
T I PPVGWTNTAHRHGVCVLGTF I TEW NEGGRLCEAFLAGDERS Y 
QAVADRLVQIT\RFFRFDGWLINIENSLSLAAVGNMPPFLRYLT 
TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 
FTNYEOWREEHLERMLGQAGERRADVYVGVDVFARGNVVGGRFDT 
DKVGGG FRPRASGP V P PLG PH FLMDLP FP S APQRNDS S CSS QSG 
DPVALRNRCPAPAKLCPH 


6028 


120 


3432 


NCbLLQAKGFHGEIEDLQQWLTDTERHIiLASKPIiGGLPETAKEQ 
U^HMEVCAAFEAKEETYKSLMQKGQQMLARCPKSAETNIDQDI 
NNt.KEKWESVETKl>NER\KT\KLEEALNLA\MEFHNSL\QDFIN 
frTLTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQII 
ELDXTGTHUKYFSQKQD WL I KNLLIS VQSRWEKWQRLVERGR 
SLDDARKRAKQ FHEAWS KLMEWLBESEKSLDSEIiEIANDPDKI K 
TQLAQHKE FQKS W3AKHS VYDTTNRTGRS LKEKTSLADDNLKIiD 
DMLSELRDKWDTI CGKS VERQNKLE EA\ LIiFSGQFTDALQAJj I D 
WLYRVEPQIiAEDQP VHGD I DLVMNliI DNHKAFQKEIX3KRTSSVQ 
ALKRSARELI EGSRDDS S WVKVQMQELSTRWETVCALS I SKQTR 
LEAAI^QAEEFHSVVHALLEWLAEAEQTLRFHGVLPDDEDALRT 
LIIX^HKEFMKKLEEKRAEIjNKATTMGDTVLAICHPDS 1TTI KHW 

nz irarfeeviiAWAkqhqqriasalagl iakqelleallawlq 
WASTTIjTDKDKEVI pqe i eevkaliaehqtfmeemtrkqpdvdk 

VTKTYKRRAADPSSLQSHIPVIjDKGRAGRKRFPASSLYPSGSQT 
Q I ETKNPRVNLLVSKWQQ VWI»LAIjERRRICLNDAIiDRIjEELREFA 

NFDFDIWRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQEFID
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA
YKPITDADKIEDEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ
fgdso^lrlvrilrstvmvrvgggwmaldeflvkndpcrakgrt 
nkblrekfiiadgasqgmaafrprgrrsrps srgaspnrsts vs 
SQAAQAASPQVPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA
ecsdfpvpsaegtpiqgsklrbpgylsgkgfhsgedsglittaa 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE

IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSKR 


6029 


1 


3533 


IMPCGSSRI/LRGCWTHPNEPVSDLS YFDCIESVMENS KVLGESM 
AGISQNAKTGDLPAFGECVGIASKALCGLTEAAAQAAYliVGIFD 
PNSQAGHQGIjVDP I QFARANQAI QMACQNLVDPGSS PSQVLS AA 
T I VAKHTS ALCNACRI AS S KTANPVAKRHFVQS AKEVANSTANIj 
VKTIKALDGDFSEDNRNKCRIATAPLIEAVENIiTAFASNPEFVS 
I PAQISSEGSQAQEP ILVSAKPMLES SS YLIRTARSIiAINPKDP 
PTWSA^IiAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 
IRDIEQASLAAVSQSriATRDDISVEALQEQLTSWQEIGHLIDP 
IATAARGEAAQLGHKGTQLASYFEPLILAAVGVASKILDHQQQM 
TVLDQTKTLAESALQMLYAAXEGGGN PKAQHTHDAI TEAAQLMK 
EAVDDIMVTLMEAAS EVGIiVGGMVDAlAEAMSKLDEGTPPEPKG 
TFVDYO/TTWKYS KAIAVTAQEMMTKS VTNP EELGGIiASQMTSD 
YGHLAFQGQMAAATAEPEE I GFQ I RTRVQDLGHGC I FLVQKAG\ 
ALQVCPTDSYTKRELl ECARAVTEKVSLVLSALQAGNKGTQACI 
TAATAVSGI I ADU3TTIMFATAGTLNABNSETFADHRENILKTA 
KALVEDTKLLVSGAAST PDKIAQAAQSS AATI TQLAE VVKLGAA 
SLGSDDPETQWIilNAI KDVAKALSDLI SATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTSLLKTVKAVEDEATRGTRALBATIECI KQ 
ELTVFQSKDVPEKTSS PEES IRMTKGITMATAKAVAAGNSCRQE 
DVIATANLSRKAVSDMIjTACKQAS FHPDVSDEVRTRAIiRFGTEC 








TLGYLDLLEHVLVILQKPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPEDPTVXABTELLGAAAS I EAAAKKLEQLKPRAK 
P JCQADE TLDFEEQI I»EAAKS I AAATS ALVKSASAAQREL VAQG K 
VGS I PANAADDGQWSQGL I SAARMVAAATSSLCEAANAS VQGHA 
S EE KLI S S AKQ VAASTAQLLVAC KV KADQDS E AMRRLQAAGNAV 
KRASDNt.VRAAQKAAFGKADDDDVWKTKFVGGIAQI IAAQEEM 
LKK3RELEEARKKLAQIRQQQYKFLPTELREDEG 


6030 


3 


1777 


FPGRGSPALQLEVLI CLGLMGLERALNVLAPI FYRNI VNLLTEN 
APWNSLAWTVTS YVFLKFLQGGGTGSTGFVSNLRTFLW I RVQQF 
TS RRVEI«L I FS HLHELS LRWHI*GRRTGE VLR I ADRGTS SVTGL I> 
SYLVFNVI PTLADI I IGI I YFSMFFNAWFGLI VFLCMSLYLTLT 
I WTE WRT KFRRAMNTQENATRARAVDSLLN F ETVKYYNAE S Y E 
VERYREAI I KYQGLEWKSSASLVU^O/TQNLVrGLGLIAGSLLC 
AYFVTEQKLQVGDYVLFGTYIIQLYl-IPLNWFGTYYRMIQTNFID 
MENMFDLLKK\3TEVKDLPGAGP FRFQKGRI EFENVHFS YADGR 
BTLQDVSFTVMPGQTLALVGPSGAGKSTI LRLLFRFYDI SSGCI 
RIDGQDISQVTQALFRFSHWELCPKDTVLFNDTIADNIRYGRVT 
AGNDE VE AAAQ AAG I HDAI MAF PEG YRTQVGERGLKIjS GGEKQR 
VAIARTILKAPGIILLDEATSAl^TSNERAIQASLAKVCAKTRTT 
IWAHRLSTWNADQILVI KDGC I VERGRHEALLS RGG VYADMW 
QLQQGQEETS EDTKPQTMER 


6031 


160 


1694 


LRMSENLDKSNVNEAGKSKSNDSEEGLEDAVEGADEAIjQKAIKS 
DSS S PQRVQR PHSSPPRFVTVEELLETARGVTNMALAHE I WNG 
DFQI KP VELPENSLKKRVKE I VHKAFWDCLSVQLSEDPPAYDHA 
I KLVGE I KETLL S FLLPGHTRLRNQ I TE VLDLDL I KQE AENGAL 
DIS KLAEFI IGMMGTIiCAPARDEE VKKLKDI KEI" VPLFRE1 FS V 
LDI^KVDMANFAISSIRPHLMQQSVEYERKKFQEXLERQPWSLD 
FVTQWLEEASEDLMTQKYKHAbPVGGMAAGSGDMPRIjSPVAVQN 
YA YL KL.LKWDHLQRPF PETVLMDQSR FHELQLQ \REQLT I LGAV 
LLVTFSMAAPG I S SQADFAEKLKM 1VK I LLTDMHL PSFHIiKDVL 
TTIGEKVCLEVSSCLSLCGS S PFTTDKETVLKGQIQAVAS PDDP 
IRRIMESRH,TFLETYItASGHQKPI/PTVPGGI»SPVOREI J BEVAI 
KFARLVNYNKMVFCPYYDAI IiSKILVRS 


6032


39 


2415 


AARLCRAQPTKSAWMIRDIiSKMYPQTRHPAPHQPAQPFKFTISE 
SCDRI KEEFQFIXJAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLN I EMHKQAEI VKRLNAI CAQVI P FLS QEHQQQ WQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
P IGS S AG LLALS SALGGQS HI>P I KDEKKHHDNDI IQRDRDS I KSS 
SVSPSASFRGAEKHRNSADYSSESKKQKTEEKEIAARYDSDGEK 
SDDNh WD VSNEDPSSPRGS PAHSPRENGLDKTRLLKKDAP ISP 
ASIASSSSTPSSKSKEt>SLNEKSTTPVSKSNTPTPRTDAPTPGS 

WQ*T*T3/"5T.'P OTT'Df'^ VODf'^TT'b'DY ROOT DimuximnnvnmnrviTimTiin 

noiruiiKrvi^RrFbVUt'LAooijKl PMAVPt-P iPTPFGI VPHAG 
MNGELTS PGAAYAGLHNIS PQMSAAAAAAAAAAAYGRS PWGFD 
PHHHMRVPAI PPNLTGI PGGKPAYSFHV5ADGQMQPVPFPPDAI* 
IGPG IPRHARQINTLNHGEVVCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKS PVSQIiDCLNRDNYI RSCRIiLPDGRTLI VGGEASTL 
SIWDLAAPTPRIKAELTSSAPACYAIiAISPDSKVCFSCCSDGNI 
AVWDLHNQTLWQFOXSHTDGASCIDISNDGTKLWTGGLDirrVRS 
W\DLREGRQLQQHD/ FFTSPVFSLGYCP \TEEWIiAVGr4ENSN\ V 
EVX,HVTKPDKYQI*HLHESCVLSIiKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE
SOJRIKEEFQFLQAQxlJSLKLECEKIiASEKTEMQRHYVMYYEMS 
YGLNIEMHKQAEI VKRLNA I CAQ V I P FLSQEHQQQ VVQ AVERAK 
OVTMAELNAI IGQQQLQAQHI*SHGHGIjPVPI>TPHPSGLQPPAI P 
PIGSSAGIJLALSSAH^QSHLPIKDEKKHHDNDHQRDRDSIKSS 
S VS PSAS FRGAEKHRNSAD YS S ES KKQKTEEKE I AAR YDSDGE K 
SDDNLWDVSNEDPSS PRGSPAHSPRENGLDKTRLLKKDAPIS P 
AS I AS SS S TPSS KS KE LS LNEKS TTP VS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASS1*RTPMAVPCPYPTPFGIVPHAG 






WO 01/53312 



PCT/US00/34263 











WNGELTSPGAAYAGIiHNISPQMSAAAAAAAAAAAYGRSPVVGFD 
PHI H1MRVPAI PPNLTG I PGGKPAYSFHVS ADSQjMQPVPFP PDAI* 
IGPGI PRHARQ I NTLNHGE WCAVT I SNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTI* 
SIWDIiAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI 
AVWDbHNQTLVRQFQGHTDGAS C I DI SNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/ FFTS PVFSLGYCP\TEEWLAVGMENSN\ V 
EVLHVTKPDKYQI^LHESCVLSL.KFAHCGKWF\VSTGKDNLIiWA 
W\RTPYG\ASIF\QSK3SSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


ESGRRRRI>KRRRSPCPGTAGGPGETNPGPGACPRGPREEAAAAM 
EIAPQEAPPVPGADGDIEEAPAEAGSPSPASPPADGRLKAAAKR 
VT F PSDEDI VSGAVE PKDPWRHAQNVTVDE VIGAYKQACQKLNC 
RQIPKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLEQTNL.DEDGASALFDM I EYYESATHLN ISFNKHIGT 
RGWQAAAHMMRKTSCLQYL\DARNTPI.LDHSAPFVARAIiRIRSS 
LAVLHLENASLSGRPLMLI^TAIjKMNKNIjRELY1i\ADNKI*NGIjQ 
DSAQLGNLL KFNCS LQI LDLRNNH VLD SGLAY I CEGLKEQRKGI* 
VTL\VLWNNQl»THTGMAFLGMT3bPHTQSIiETLNIX3HN P IGNEGV 
RHLKNGLI SNRS VLRLGLASTKLTCEGAVAVAEFI AES PRLLRL 
DLRENE I KTGGLMALS LALKVNHS LLRLDIiDREPKKEAVKS FI E 
TQKALtJ^IQNGCKRNLVLAREREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGEREET 
PSGAI DTRDTGSSE PQPPPEPPRSGPPLPNGLKPEFALAIjPPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLLEASQESGQETL 


6035




404 


S VTYLGI I LHKNTGALPADPVQI>I SQTPTPSTJCQQLLSFLGMVG 
YF YLWI PG FAI LTKPLCKLiTKENI*ADA I DP KSFSHSS FRS LKTA 
LENASTLALPDSSQPF\SLHTAEVQGCWE 1 LTQGLGPLP V 


6036 


1745 


356 


LPDVEKLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKIjQRN 
SRGGQGRGVEKPPHIAALI LARGGSKGI PLKNI KHLAGVPLIGW 
VLRAALDSGAFQSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
S STS L»DAI 3 E FLN YHNEVD I VGN I OATS PCLHPTDLQKVAEMIR 
EEG YDS VFS WRRHQFRWS EI QKGVRE VTE PLNLNPAKRPRRQD 
WDGEIiYENGSFYFAKRHLIEMGYLQGGKMAYYEMRAEHSVDIDV 
DIDWP IAEQRVLRYGYFGKEKLKE I KjLIjVCN I DGCLTNGH I YVS 
GDQKEI ISYDVKDAIGI SLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVS VSDKLAVVDEWRKEMG IjCWKEVAYIiGNEVSDEE CIiK 
RVGLSGAPADACSTAQKAVGYICKCNGGRGA\ IREFAEHI C\LIi 
MEKGI*INFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTSVJWMSSVLTILLiFSLQGNKMLtNYSAPSAGGYIiIiPRKPVGTPA 
GGGFPRRHSVTbPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQKQPGGGQVNS SRYKT\ ELCRPFEENGACKYG 
DKCQFAHGIHELRS LTRH P KYKTELCRTFHTIG FC P YGPRCHFI 
HNAEERRAIiAGARDIjSADRPRLQHS FS FAGFPS AAATAAATGLL 
DS PTS ITP P P I LSADDLIX5S PTLPDGTNNPF\AFSSQELAS IiFA 
PSMGLPGGGS PTTFLFRPMS ES PHMFDS P PSPQDSLSDQEG Yt*S 
SSSSSHSGSDSPTIjDNSRRLP I FSRLS I SDD 


6038 


1450 


426 


SSALQEFGTRNHTFGVPLPHRRKQI I SCNICQLRFNSDSQAAAH 
YKGTKHAKKLKALEAMKNKQKS VTAKDS AFCTTFTS IT TNT INTS 
SDKTDGTAGTPA ISTTTTVEI RXSS VMTTEITS KVEKSPTTATG 
NSSCPSTETEEBKAKRIJL\YCSLCKVAVNSASQIiEAHNSGTKHK 
TMLEARNGSGTI KAF PRAG VKG KGPVNKGNTGLQNKT FHCE I CD 
VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFSKEPSKPIAPRILPNPIiAAAAAAAAVAVSSPFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY 


6039 


4073 


1000 


LDEYEARLTLAKLDDFEEDNEDDDENRVNQEEKAAKITELINKL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
RKTEDSFYNNSYNPFKEVOTPQYIiNPFDEPEAFVTIKDSPPQST 
KKKNIRPVDMSKYI>YADSSKTEEEELDESNPFYEPKSTPPPNNI* 
VNPVQBIiETERRVKRKAPAPPVIjSPKTGVLNENTVSAGKDLSTS 


6040 






^KPSPIPS^VLGRKPNASQSl,LVW(^VTKNYRGVKITMFTTSwH 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 
SRLLEPS DM VLLAI PDKIiTVMT YL YQ I RAHFSGQELNVVQ IEEN 

SSKSTYKVGNYETDTNSSVDQEKFYAELSDLKREPELQQPISGA 
VDFLSQDDSVFVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
SDTEPQKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKKRLLKA 
ETLEI.SDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LEWSRSLE CRSDPE S P I KKTS LS PTS KX*G YS YS RDLDLAKKKHA 

SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVI> 
liEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQLIAEARSGGKMSELPSYGERAAEKLKERSKASGDENDNIEI 
DTNEE I PEGFWGGGDELTNLENDLDTPEQNSKI*VDLKLKKLLE 
VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 
RKPWPS KDS T VRKTQLQS FSQ Y I ENRPEMKRQRS IQEDTKKGN 
kfcKAAI TETQRKPSEDEVLNKGFKDS \ SQYWGEIAAIjENEQKQ 
IDTRAALVEKRLRYT.MDTGRNTEEEEAMMOEWFMLVNKKNALIR 
R^QLSLI^KEHDI^RRYEbl^RELRAMIiAIEDWQKTEAQKRRE 

QLLIiDEZ>VALVNKRDALVRDIiDAQEKQAEEEDEHLERTliEQNKG 
KMAKKEEKCVLQ 


6041 


475 


1052


PTALMTAPSCAFPVQFRQPSVSGLSQITKSLYISNGVAANNKLMH

LSSNQITMVINVSVEWNTLYEDIQYMQVPVADSPNSRLCDFFD 

piadh±hsvemkqgr\tllhcaagvsrsaalciaylmkyhamsl 

LDAKTWTKSCRPHRPNSGFWEQLIHYEFQLFGKNTVHMVSSPV 
GMIPDIYEKEVRLMIPL


6042 


2 


3886 



TEKUKKTAHNLENVLIHFVJERLSEICVAKISKPEADVESVLGVS 
NLIiCVLQKPKGSLKSSKKKNGKVRFADEILESNKENEKCVSSEG 
EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 
VNERKSEQHLRFLSTLLDSFSSSRVFKMLLGDEKQSIVQAKPLE 
IAKLVQKNPAVQFLYQKLIGWLNEDQRKDFGFLVDILYSALRCC 
DNDMERKKVLDDLTKVDLKWNSLLKIIEKACPSSDKHAIjVTPWI> 
KGDILGEKLVNI*ADCIiCNEDLESRVSSESHFSERWTlit»SLVIjSQ 
HVKND YIjI GDVYVE R 1 1 VRLHETXjFKTKKLS EAESS DSSVSFIC 
DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 
CKLKNTWIjSGVNLLVHQTDSS YKESTFLHI*S ALWI»KNQVQAS S L 
DI NSIjQVLIjSAVDDLLNTLLES EDS YXMGVYI GS VMPNDS E WEK 
MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHIiCT 
SALLSKMVLIALRKETVLENNELEKIIAELIiYSLQWCEEI^DNPP 
IFLIGFCEILQKMNITYDNLRVLGbJMSGLLQLLFNRSREHGTLW 
SLIIAKI,ILSRSISSDEVKPHYKRKESFFPLTEGNLHTIQSLCP 

flskeekkefsaqcipai^wtkkdlcstnggfghiaifnsclq 

TKS I DDGELLHGI L K 1 1 1 S WKKEHEDI FI*FS CNLSEAS PEVLG V 
NIEriRFLSLFUCYCSSPLAESEWDFIMCSMLAWLETTSENQAL 
YS I PL VQLFAC VS CDLACDLS AF FDS TTLDTIGNLPYNL I SE WK 

EFFSQGIHSLLLPILVTVTGENKDVSETSFQNAMLKPMCETDTY 
ISKEQLLSHKLPARLVADQKTNLPEYLQTLLNTLAPLLLFRARP 
VQIAVYHMLYKLMPELPQYDQDNLKSYGDEEEEPALSPPAALMS 
I*LSIQEDLLENVLGCIPVGOIVTIKPLSEDFCYVJjGYI>LTWKI*I 
LTFFKAASSQLRALYSMYUIKTKSLNKLLYHLFRLMPENPTYAE 
TAVEVPNKDPKTFFTEELQLSIRETTMLPYHIPHLACSVYHMTX, 
KDLPAMVRLWWNS S EKRVFNI VDRFTS KYVS S VLS FQE I SS VOT 

STQLFNGMTVKARATTREVMATYTIEDIVIELirQLPSNYPLGS 
1 1 VESG KRVG VAVQQWRNWMLQLS TYLTHQNGS IMEGLALWKNN 

/DKRFEGVEDCMICFSVIHGFNYSLPKKACRTCKKKFHSA\CLY 
<WFTSSNKSTCSLCRETFF 








1306 


253


1AEIAPASPSD1KASVSNGDTTLLCSRRQSCGMNEVRQVSLTYP 
5SPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
?AQ RAPGGLS YPAAS PTPHAAFLADP VSNMAMAYGS SLAAQGKE 
jVDKNIDRFIP ITKLKYYFAVDTMYVGRKLGLLFFP YLHQDWEV 
JYQQDTPVAPRFDVNAPDLYIPAMAFITYVLVAGIALGTQDRFS 
'DLLGLQASSA^AWLTLB VLAI LLS L YIiVTVNTDIjTTIDLVAFI* 
-YKYVGM IGGVLMGLLFGKIG YYLVLGWCCVAI FVFMIRTLRLK 








ILADAAAEGVPVRGARNQLRMYLTMAVAAAQPMUIYWLTFHLVR 


6043 


403 


599 


LCLFFPFPCATPVLPLPSL1SAI./CLSHLSVSSWFCPCQPPLPC 
P1>P PLQN KTAKGS LSTEQSERG 


6044 


793 


412 


KLEMWNFT1»I SKVKI SREVTMI ASKFG I GQQVRHSLLG YLGVW 
DIDPVYSLSEPSPDELAVNDELRAAPWYHWMEDDNGLPVHTYL 
AEAQLSSELQDEHP\EQPSMDEIjAQTIRKQIjQAPRIiRN 


6045 


155 


2299 


SPLPQVAAMNYIiRRRIjSDSNFP^LPNGYPaTDLQRPQPPPPPPG 
AHS PGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFS SL 
SNAVKQTTAAAAATFSEQVGGGSGGAGRGGAASRVLLVIDEPHT 
DWAKYFKGKK IHGEIDI KVEQAEFSDLNLVAHANGGFS VDME Vh 
RNG VKWRS LKPDFVLIRQHAFSMARNGDYRSI.VIGLQYAGI PS 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
EMLSS\TTYPVWKMGHGTLWGWGKVKVDNQHDFQDIASWALT 
KTYATAEPFIDAKYDVRVQKIGQNYKAYMRTSVSGNWKTNTGSA 
MLEQIAMSDRYKIjWVDTCSEIFGGLDICAVEALHGKDGRDHIIE 
WGSSMPLIGDHQDEDKQbIVELWNia*4AQALPRQRQR n ASPGR 
GSHGQTPSPGAJLPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRQG P PLQQR PP PQGQQHLSGLGP P AGS PLPQRLPS PTS APQQP 
ASQAAPPTQGQGRQSR PVAGGPG APPAARP PAS PS PQRQAGP PQ 
ATRQTS VSG PAP PKAS GAPPGGQQRQGP PQKP PGPAG PTRQASQ 
AGPVP3TGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 
AAGGPPHPQIiNKSQSLTNAFNtiPEPAPPRPSJjSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


egltgpcerv p fl»lg rg p phgatraghrravr wag pes lpplpr 
slimdspragthqgpldaetevgadrctstayqeqrpqveqvgk 
qapi>s pglpamggpgpgp cedp agaggagaggse pl vt vtvqca 
ftvalrarrgadlsslrallgqalphqXaqlgqlsyiapgedgh 
wvp i pe 2es lqrawqdaaacprglqlqcrgaggrpvl yqwaqh 
s ys aqg pedlgfrqgdtvdvlcevdqawl»eghcdgr ig i fp kcf 
w pag prmsgapgrlprs qqgdqp 


6047 




1405


PVIjVTS LRNIREADTliRPPQLMEVSADI I STVEFNHTGEI»I*ATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDBEGKLKDLSTVTSIiQVP VLKPMDLMVEVS PRRI FANGHTYH 
INS I S VNSDCETYMSADDIiRINLWHIAITDRS FTP\NI VDIKPA 
NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL\NMEARPIKTYQVHDYLRSKI>CSLYENDCI FDKFECA 
WNGSDSV I MTGA\YNNFFRMFDRNTKRDVTL\EASRES SKPRAV 
LKPRRVC VGGKRRRDD I S VDSLDFTKKI LHTAWHP AEN I IAIAA 
TNNLYI FQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 
KGTSNSS KTRAGANSKGRRGSQNSSEHRP PASSTS EDVKAS PS S 
ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
EPTVLDRNCPS P VL I DCPH PNCNKKYKH I NGLKYHQAHAHTDDD 
SKPEADGDSEYGEEPILHADLGSCWG\ASVSQK\GSLSPARSAT 
PKVRLVEPHSPSPSSKFSTKGLCKKKI^GEGDTDLGALSNDGSD 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLiKPEKIPSKSIiK 
SARP I / APIA I P PQXJ I YTFQTATFTAAS PGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKJCKKESSKE 
LESPLTPGKVCllAEEGKSPFRESSGNGMKMEGIjIiNGSSDPHQSR 
IiASIKAEADKIYSFTDNAPSPSIGGSSRLENTTPTQPLTPLHW 
TQJJGAEAS S VKTNS PAYSDISDAGEDGEGKVDSVKS KDAEQLVK 
EGAKKTLFPPQPQS KDSPYYQGFESYYSPS YAQSS PGALNPSSQ 
AGVESQALKTKRDE EPES I EGKVKND I CEE KKPEJbS S SSQQPS V 
IQQRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHLJjSTNTAYRQQ 
YEEQQKRQSIjEQQQ RG VDKKAEMGLKEREAALKEEWKQKPS I p p 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVI I PKLDDSSKLPGQ 
AP EGLKVKLSDASHLSKEAS EAKTGAE CGROAEMDP I LWYRQEA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 








EDGKESTSSDCKLPTSEESRIX3SKEPRPSVHVPVSSPLTQH0SY 

IPYMHGYSYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSYSFSP 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 
S PTISDKT^OKRnPf5f^OfTinrr , /~ , r'r'or>oc'iT/->/-»7vr»^»-i^>T-»«-iTrTN« 

■ L ^ AV A J v i ^^A-'x\vj\3\-Vj v VLjv^LiLjtjl_i>i3 VGGASGGERSVDRPRT 

SPSQRLMSTHHHHHHU5YSLLPAQYKLPYAAGLSSTAIVASQQG 
STPSLYPPPRR 


6049 
6050 


215 


1089 


AMTGVFDRRVPS IRSGDFQAPFQTS AAMHH PSQES PTLPES SAT 
DSD YYS PTGGAPHGYCS PTS AS YG \ KALN P YQ YQYHG VNGS AGS 
YPAKAYADYSYASSYHQYGGAYNRVPSATNQPEKEVTEPEVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYIiALPERAELAASL 
GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSKQSPASSYLENSASWYTS 

ATNCQTKTQTlr DDDT" C? T <*\ITnT i\r "fc rx-irrvr i« 

amo ;> j. n JbFFFCjrSLQHPLALAS GTL Y 


6051 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKIAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6052 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR




566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKXKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6053 
6054 


201 


1704 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDSEDRSDSRAAQPA
HDSGHGDDES PSTS SGTAGTS S VPELPG F YFDPE KKR YFRLLPG 
niN wv-xv t-u j. jus£> i Ky KEMES KRLRLLQEEDRRKKI ARMG FNAS S M 
LRKSQLGFLNVTNYCHLAHELRLSCMERKKVQIRSMDPSALASD 
RFNLILADTNSDRLFTVNDVTVGGSKYGI INLQSLKTPTLKVFM 
HENLYFTNRKV\NSVCWASLNHLDSHILLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCS FRI PGAWSCAWSLNIQANNCFS 
TGLSRR VLL TNWTGHRQS FGTNSDVLAQQ FALMAPLLFNGCRS 
G 21 FAI DLRCGNQGKGWKATRL FHDSAVTS VRILQDEQ YLMASD 
MAGK1KLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGILVAVG 

QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLMAVGQDLYCYS YS 




1 


1054 


P?I ARLQEFGTSRRHMAAPSG VHLLVRRGSHRI FSS PLNHI YLH 
KQSS SQQRRN FFFRRQRDI SHS I VLPAAVS S AHP VP KH I KKPDY 
VTTG I VP DWGDS I E VKNEDQ I QGLHQACQLARHVLLLAGKS LKV 
DMTTEE IDALVHREI I SHNAYPS PLGYGGFPKSVCTSVNNVLCH 
G I PDSRPLQDGDI INI D VTVYYNG YHGDTS ETFLVGNVDECGKK 
LVEVARRCRDEAIAACRAGAPFS VI GNT I SHI THQNGFQ VCPHF 
VGHG IGS YFHGHPE I WHHANDSDLPMEEGMAFT I E P 1 1 TEGS PE 
FKVLEDAWTWS LD/ TS KVS AQFEHT VX»I TSRGAQI LTKLPHEA 


6055


421 


2364 


P PYFLLS FLAWWLYGQS DRTETDI SQSAGPPPGTTiQCSAIiHHDP 
GCANCSRFCRDCSPPACQCHTHVFPGNALNGVQPPEIjSRTLAIjI 
SS REP PR KKKKSQTETGKERERTSFLTQGGKRFELQHGLAG I CM 
TLL I TGD S I VS AE AVWDHVTMANREIiAFKAGDVI K VLDASNKD W 
W WGQIDDEEGW FPAS F VRLWVNHEDEVEEGPS DVQNGHLDPNSD 
CLCLGRPLQNRDQMRANVI NE I MSTERH Y I KHLKDI CEG YLKQC 
RKRRDMFSDEQLKVI FGN I ED I YR FQMG FVRDLE KQYNNDDPHL 
SEIGPCFLEHQDGFWIYSEYCNNHLDACMELSKLMKDSRYQHFF 
EACR LLQQH I D I A\ I DGFLLTP VQKI CKYPLQLAEI/LK YTAQDH 
SDYR YVAAALAVMRNVTQQ I NER KRRLEN 1 DK I AQWQAS VI*DWE 
GED I IiDRSS ELI YTGEMAW I YQP\ YGRNQQRVFFliFDHQMVLCK 
KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKIiH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMT VR KV PKQKG VNS ARS V P PS Y PP PQDPLKHGQ YI*VP 
\DG I AQ3QV FE FTEPKRSQ S PFWGNFSRLTPFKK 


6056 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YliAYMESKGAHRAGIiAKVI PPKEWKPRQCYDDIDNLLI PAPIQQ 
MVTGQSGLFTQ YN I QKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ERKYWKNIjTFVAP I YGADINGSI YDEGVDEWNIARLNTVLDWE 
EECGISI EGVNTP YLYFGMWKTTFAWHTEDMDLYS INYLHFGEP 
KSWYAI PPEHGKRLERLAQGFFPS SSQGCDAFLRHKMTLISPSV 
LKKYG I ? FDKI TQEAGE FMITFPYGYHAGFNHGFNCAESTNFAT 
VRW I DYGKVAKLCTCRKDMVXISMD I F VRKFQPDR YQLWKQG KD 
I YTI DHTKPTPASTPEVKAWLQRRRKVRKASRS FQCARSTS KRP 
KADEEEE VSDB VDGAEVPNPDS VTDDL KVSEKSEAAVKLRNTEA 
SSEEESSASRMQVEQNLSDH I KLSGNSCLSTS VTEDIKTEDDKA 
YAYRSVPSISSEADDSIPIiSTGYEKPEXSDPSELSWPKSPESCS 
SVAESNGVLTEGEESDVESHGNG1.EPGEI PAVPSGERNS FKVPS 
I AEGENKTSKS WRH PLSRPPARS PMTLVKQQAPSDEELPEVLS I 

T?'RT« , \rP , 'PTII*CIW2WDl .TUT WnTlT'DDM'PA AETMTVMJvmr* nuvntT/"!* t 

LiiJiVLL x iu&wj\iLb'ij±rtLtWtj i, mst'isib AAc,sj a IJNATVAKMKPHGAI 
CTLLMPYHKPDSSNEENDARWE XKLDEWTSEGKTKPIjI PEMCF 
I YSEENIEYSPPNAFJLEEDGTSLIjI s cakccvrvhascygi psh 
E ICDGWLCARCKRWAWTAECCLCNLRGGALKQTKNNKWAHVMCA 
VAVPEVRFTNVPERTQIDVGRIPLORI>KLKCIFCRHRVKRVSGA 
CIQCSYGRCPASFHVTCAHAAGVL\MEPDDWPYWNITCFRHKV 
NPNVKSKACEKVI S VGQTV ITKHRNTRY YSCRVMAVTSQTFYE V 

AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VSAGRCHLGTCQVNS LS S PHVSQAQQETYLGFW INSKXSQCNI F 
LSGTY 


6057 


1 


853 


FVAKLKEQEGEGGLGPRKEKGRARGRERRRKMQLTRCCFVFLVQ 
GSLYIiVI CGQDDGPPGSEDPERDDHEGQPRPRVPRKRGHIS PKS 
RPMANSTLLGLLAPPGEAWGI LGQPPNRPNHSPPPSAKVKKI FG 
WGDFYSNIKTVALNLLVTGKIVDHGNGTFSVHFQHNATGQGNIS 
I SLVPPSKAVEFHQEQQIFI EAKASKI FMC\RMEWEKVE\RGRR 
TSLFTHDPAKICSRDHAQSSATWSCSQPFKVVCVYIAFYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 


H PLPSAS LGLPS VS IiG VSLC VRS ALLEAWPMLP KRRRARVGS P 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSS EATH WMEETS AEEAVS WQERRMAAAPPGCTP PALLD 
I SWLTES LG AGQPVPVECRHRLE VAGPS KGPLSPAWMPAYACQR 
P7PLTHHNTGLSEALE ILAEAAGFEGSEGRLLTFCRAASVLKAL 
PS PVTTLSQLQGLPHFGEHSSRWQELLEHGVCEEVERVRRSE / 
RIiFTQIFGVGVKTADRWYREGIiRTliDDIiREQPQKIiTQQQKAGBP 
SREAGPWASLNCTLDPSASTP 


6059 


2 


3650 


QQDFSSLADLTDHRAHRCPGDGDDDPQI>SWVASSPSSKDVASPT 
QMIGDGCDIjGLGEEEGGTGLPYPCQFCDKSFIRLSYLKRHEQIH 
SDKLP FKCTYCSRLFKHKRSRDRHI KLHTGDKKYHCHECEAAFS 

rsdhlkihlkthsssk^fkcivckrgfsstsslqshmqahkknk 
ehiajcsekfjuocddfmcdycedtfsqteelekhvltrhpqiisek 
ADLQC IHCPE VF VDKNTLLAH I HQAHANQKH XCPMCPE \QFSS V 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
ERGSTPDSTLKPr^GQKKMRDDGQGWTKVVYSCPYCSKRDFNSL 
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNLNEHVRKLHKN 
HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP 
SDGNNAFFCNQCSMGFLTESSLTEH I Q\Q\AHCS VGSAKLESP V 
VQPTQSFMEVYSCPYCTNSPI FGS ILKLTKHI KENHKNIPLAHS 
KKS KAEQS P VSS DVEVS S PKRQ RLS AS ANS I SNG EYPCNQCDLK 
FSNFESFQTHLKLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVDD\LQKH\LLDMPHPLCCTHCT\L 
CQE VFDS\ KVS I \QVHLAVKHSNE KKMYRCTACNWDFRKEADLQ 
VHVKHSHLGNPAKAHKCIFCX3BTFSTEVELQCHITTHSKKYNCK 
FCS KAPHA I ILLEKHLREKHCVFDAATENGTANGVPPMATKKAE 
PADLOGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 
VLLQNHRLRDHNI RPGEDDGSRKKAE FIKGSHKCNVCSRTFFSE 
NGLREHLQTHRGPAKHYMCPI CGERFPSLLTLTEHKVTHS KSLD 
TGTCRI CKMPLQSEEEFI EHCQMH PDLRNS LTGFRCWCMQTVT 
STLELKIHGTFHMQKLAGS SAAS S PNGQGLQ KL YKCALCLKEFR 
SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGLAPPEPADRPCA 
GLRCPE CS VKFES AEDLE SHMQVDHRDLTP ETSG PRKGTQTS PV 
PRKKTYQC1KCQMTFENEREIQIHVANHMIEEGINHECKLCNQM 
FDSPAJCIiLCHLIEHSFEGMGGTFKCPVCFTVFVQANKLQQHI FA 
VHGQEDKIYDCSQCPQKF FFQTELQWHTMSQHAQ 


6060 


2145 


202 


S YE I VGKNKLEVNHSQLKALCKCSLPSRLLPLGENLPLLDRGFR

KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 

DIS AS R PNTIiLLMADDLG I GD IGCYGNNTMRTPNI DRLAEDG VK 

LTQKISAASI.CTPSRAAFLTGRYPVRSGMVSSIGYRVLQWTGAS 

GGLPTNETTFAKILEEKGYATGLIGKWHLGLNCESASDHCHHPL 

HHGFDHFYGMPFS1^DC7^WELSEKRVNLEQKLNFLFQVIjAI*V 

ALTLVAGKLTHL I PVSWMPVI WSAI*SAVLbLASS YFVGALI VHA 

DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFULFV 

SFLHVHIPLITMENFLGKSLHGLYGDNVKEMDWMVGRlLDTIiDV 

EGLSNS TLI Y FTS DHGGSLENQLX3NTQYGGWNG I YKGGKGMGGW 

EGGlRVPGIFRWPGVLPAGRVIGEPTSIiMDVFPTWRIiAGSEVP 

QDRVI DGQDI1LPLLI1GTAQHSDHEFLMHYCERFI1HAARWHQRDR 

GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKWHHDPFKLFD 

LSRD PSETHILTPASE PVF YQVMER \ VQQAV WEHQRTLSPVPLQ 


6061 


110 


1330 


MWIHMKRKTIKNINTFENRMLMLDGMPAVRVKTELLESEQGSPN
VHNYPDMEAVPLLLNNVKGEPPEDSLSVDHFQTQTEPVDI>S INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASS PTVI TS 
VSSASSSSTVLTPGPLVASASGVGGQQFIiHIIHPVpPSSPMNliQ 
SNKLS HVHR I PVVVQS VPVVYTAVRS PGNVNNTI WPLLEDGRG 
HGKAQMDPRGLS PRQSKSDSDDDDLPNVTLDSVNETGSTAJLS I A 
RAVQEYHPS PVSRVRGNRMNNQKFPCS ISPFSIESTRRQRTVLN 

v tr uoKix l/ilol UK-L>r \ HAjLA^IJ KJj X 1 Kb i> ^> PbKVHKK 1 HTGEKPY 

KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWVVPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWVVPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6064 


913 


311 


NLPQSJLPRPTEHS P PYSLEKMTDLVAVWDVALSDGVHKIEFEHG 
TTSGKR WYVDGKEE IRKEWMPKLVGKETFYVGAAKTKATINID 
AISGFAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LE KDAMD VWCNGKKIiETAGE F VDDGTETHFS IGTH\ACYI KAV\ 
SSG\KRKEGIIHTLIVDNREIPE1AS 


6065 


1153 


641 


MS VR VARVAW VRGLGAS YRRGASS FPVP PPGAQGVAELLRDATG 
AEEEAPWAATERRMPGQCSVLLFPGGGSQWGMGRGLLNYPRVR 
ELYAAARRVLG YDLLELS LHG PQETLDRT VHCQ PAI FVASLAAV 
E KLHHLQPS VI ENCVAAAGFS VGE FAALVFAGAME FAEG 


6066 


68 


3470 


VKENM PATRKPMRYGHTEGHTE VC FDDSGS FI VTCG S DGDVR I W 
EDLDDDDP KF INVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDGI LTRFTTNANHVVFNGDGTKI AAGSS D\ FLVKI VDVMDSS 
QQKTFRGHDAPVLSLSFDPKDIFLASASCDGSVRVWQISDQTCA 
I S WPI1I1QKCNDVINAKS I CRIAWQPKSGKLLAI PVEKSVKLYRR 
FS WSHQFDLSDNFISQTLNI VTWS PCGQYLAAGS INGLI I VWNV 
ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 
CDPSGXTSS S KVS SRVEKDYNDIiFDGDDMSNAGDFLNDWAVEIP 
SFSKG I INDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLKT 
GS S LliKEEEEDGQEGS I HNLPtiVTSQR PFYDG PMPTPRQKP FQS 
GSTPLHIiTHRFMVWNS IGI ircyndeqdnai dvefhdts IHHAT 
HLSNTIiNYT I ADLSHE AI LLACESTDELASKLH CL.HFSS WDSS K 
EKI IDLPQNEDIEAICLGQGWAAAATSALLLRLFTIGGVQKEVF 
SLAGPWSMAGHGEQLFIVYHRGTGFDGDQCLGVQLLELGKKKK 
QILHGDPLPLTRKSYLAWIGFSAEGTPCYVDSEGIVRMLNRGLG 
NTWTPI CNTREHCKGKSDHYWWGIHENPQQLRCI PCKGSRFPP 
TLPRPAVAILSFKLPYCQIATEKGQMEEQFWRSVIFHNHLDYLA 
KNGYEYEESTKNQATKEQQELLMKNLALSCKLEREFRCVEKADL 
MTQNAVNLAIKYASRSRKIilLAQKLSELAVEKAAELTATQVEEE 
EEEEDFR KKIiNAG YSNTATE WSQPR FRJJQ VEEDAEDSGEADDE E 
KPEIHKPGQNSFSKSTNSSDVSAKSGAVTF5SQGRVNPFKVSAS 
S KEPAMSMNSARSTNILDNMGKSSKKSTALSRTTNNBKSPI I KP 
LI PKPKPXOASAAS YFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPQNTENQRPKTGFQMWLEENRSN I I>SDN PDFSDEADI I KEGM 
IRFRVLS TEERKVWANKAKGETASEGTEAKKRKRWDESDE TEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


6067 


858 


321 


LPWQRI^VLLSRGKMAWGWLESU?TAQKTAIJ^DGRRKVHYI,F 
PDGKEMAEEYDEKTSELLVRKWRVKSALGAMGQWQLEVGDPAPI, 
GAGNLG PEXiI KESNANPI FMRKDTKMS FQWRI RNLPYPKDVYSV 
SVDQKERCIIVRTTNKKYYKKFSIPDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNLSDSGEEP 
RGEAEAPHHGTGHPESAGEHALEPPAPAGASASTPPPPAPEAQL 
PPFPREIiAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQTVPSSGTNGVS LPADCTGAVPAAS PDTAAW 
RS P SEAADEVCALEEKE PQKNES SNASEE EACEKKDPATQQAFV 
FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 
S LE N3 TNS AD AS SNKFVFGQN MS ER VL»S P P KLNEVS S DAN R EN A 
AAESGSESSSQEATPEKESLAESAAAY^TKATARKCLLEKVEVIT 
GEEAESNVLQMQCKLFVFDKTSQSWERGRGLLRLNDMASTDDG 
TLQSRLSDAGPRGSLR\ LILNTKLWAQMQ I DKAS EK\S I RITAM 
DNEDQG VKVFLISASS KDTGQVYAALHHR I LALRS RVEQEQEAK 
MPAPEPGAAPSNEEDDSDDDDVIiAPSaATAAGAGDEGDGQTTGS 
T 


6069 


583 


27 


PTRPGQAGSSSAMAAQRLGKRVLSICLQSPSRARGPGGSPGGIiQK 
RHARVTVKYDRRELQRRIiDVEKWIDGRLEELYRGMEADMPDEIN 
IDELLEIiESEEERSRKIQGLIiKSCX3KPVEDFIQELIAKLQGLHR 
Q \ PGLRQPS PS? \DGQPSAPFQG PGARTAS PLTLLALFPGP PER 
RPALLCVLSCI 


6070 


478 


858 


I RVTVDGEFLH Y I FPLQFLDS PEW/RFTETHRGRHF\QVTLTAE 

TDCRYVSWRRKKLYLL faqhry i srlfs vmgsd i adkl yalnd 

RVY I GKRYH YD I RIjPNFYQMSTPE I RRSPI/TQHFQNSRRYW 


6071 


2 


1654 


HEARTKGNMALARP \ VRLFSLVTRLLLAPRRGLTVRS PDEPLPV 
VRI PVALQRQLEQRQSRRRNLPRPVIiVRPGPLLVSARR PELNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFS I ERAQQEAPAVRKIiS 
SKGSFADLGAWKPRVLHALQE\AAPEWQ\PTTVQSSTIPSLLR 
GRHWCAAETGSGKTLSYLLPLLQRLLGXHPSLDSLPIPAPRGI, 
VLVPSRELAQQVRAVAQPLGRSLGLLVRDIjEGGHGMRR I RLQLS 
RQPS AD VLVATPGALWKALKS RLI SLEQLSFLVLDEADTLLDES 
FLEIiVDYI LE KS H I AEGPADLEDP FNP KAQLVLVGATF PEGVGQ 
LLN K VAS PDAVTT I TSS KLHCI MPHVKQTFLRLKGADKVAEL VH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWIKSYILDDHKIQHIiRlj 
OGQMPAItMRVG I FQSFQKS SRDILLCTDI ASRGIiDSTGVELVVN 
YDFP PTLQDY IHRAGRVGRVGSEVPGTVISFVTHPWDVSIiVQKI 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSKPFPLVSSSRWLVKRGELTAYVEDT 
VLFS RRTS KQQ VYF FLFND VL 1 1 T KKKS EES YNVND YSLRDQLL 
VESCDNEELNSSPGKNSSTMLYSRQSSASHLFTLTVLSNHANEK 
VEMLLGAETQSERARWITALGHSSGKPPADRTSLTQVEIVRSFT 
AKQPDELSIX3VADVVL»I\YQRVSDGWYEGER\ljRDGERGWFPME 
CAKEITCQATIDKNVBRMGRLLGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/S ILVHCAVGVSRSATLVLAYLMIjYHHLT
LVEAIKKVKDHRGI IPNKGKJLRQLIALDRRLRQGLEA 


6074 


168


1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDI.ILDHA 
GGNRASRAKVILLTGYAHSSLPAELDSGACGGSSIiNSEGNSGSG 
DSSS YDAPAGNS FLEDCELSRQ IGAQLKUjPMNDQI RELQTI IR 
DKTASRGDFMFSADRIilRLWEEGLNQLPYKECMVTTPTGYKYE 
GVKFEKGNCGVS I MRSGEAMEQGLRDCCRS IRIGKILIQSDEET 
QRAKVY YAKFPPD I YR RKVLLM YP I LQTG \ NTVT E AVK VTjI EHG 
VQPSVI H.LSLFSTPHGAKS 1 1 QEF PEI T I LTTE VH PVAPTHFG 
QKYFGTD 


6075 


320


1091 


P?TCQPQEVEHH\YGYVPILGWKTLPSRCHQCVIVSSSSHLLGT
KLGPE I ERAECT I RMNDAPTTG YS ADVGNKTTYR WAHS S VFRV 
I*RRPQEFVNRTPETVFIFWGPPSKMQKPQGSI,VRVIQRAGLVFP 
NT1EAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVP PN YCSQRPRLQRMP YHY YEP KGPDECVTYI 
QNEHSRKGNHHRFI TEKRVFSSWAQLYG ITFSHPSWT 


6076 


1721 


107 


HPS PTEAPR VQHLTMDCTWRI LF1#VAAATGTHAQVQI>VQSGAB V 
KKPGAS VKVS CKVSG YTI/TE LS MHWVRQAPGKGIiEWMGAFDPED 
GETI YAQKFOGR VTMTE DTS TDTAYMEIiS SUR3 EDTAVYYCATD 
HGDYAFDI WGQGTMVTVSSAPTKAPDVFP 1 1 SGCRHPKDNSP W 
LACIjITGYHPTSVXTVTWYMGTQSQAXQRTFPEIQRRDSYYMTS 
SQLSTPLQQWRQGEYKCWOHTASKSKKE I FRWPES PKAQASSV 
PTAQPQAEGSIiAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 
TKTPECPSH1X3PLGVYLLTPAVQDLWLRDKATFTCFVVGSDLKD 
AHLTWEVAGKVPTGGVE EGLLERHSNGS QSQHSR1/TL PRS L WNA 
GTSVTCTLNHPSIiPPQRLMAIJlEPAAQAPVKl*SIJNIjriASSDPPE 
A\ASWLLCEVSGFSPPNrLLMWLEDHGEVNTSGFAPARPL?KP\ 
RS TTFWA\ WS VLRVPAP PS PQPATYTCWS HEDSRTLLNASRSL 
EVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPIFWIGLISSVCCVFAQTDENRCLKANAKSCGECIQ 
AGPNCGWCTNS T FLQEGMPTS ARCDDLEALKKKG CP PDD I ENPR 
GSKDI KKNKNVTttRS KGTAEKLKPEDITQIQPOXJLVLRLRSGEP 
QTFTLKFKRAEDYPI DLYYLM\DLSYSMKDDLENVKSLGTDLMN 
EMRR ITSDFR IGFGS FVEKTVMPYI STTPAKLRNPCTSEQKCTS 
PFS YKNVLS LTNKGE VFNELVGKQRI SGNLDS PEGGFDAI MQ VA 
VCGSLIGWRWVTRriLVFSTDAGFHFAGDGKLGGIVLPNDGQCHL 
ENNMYTMSHY YD YPS I AHLVQKLSENN1 QTI FAVTEEFQPVYKE 
LKNLIP KSAVGTLSANSSNVI QL I IDAYNSL.SSEV I LENGKLS E 
GVTIS YQS Y\ CKNGVNGTGENGRKCSNI SIGDEVQFEIS ITSNK 
CPKXDSDS FKIRPLG FTEEVEVI LQYI CECECQSEG I PESPKCH 
EGNGTFBCGACRCNEGRVGRHCECSTDEVHSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNEIYSGKFCECDNFNCDRS 
NGL»I C3GNG VCKCRVCECNPNY TG S ACD CSLDTSTCE ASNGQ I C 
NGRG I CE CG VCKCTDP KFQGQTCEMCQTCLGVCAEH KE CVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFT YS VNGNNE VMVHWENPECPTG PDI I P I VAGW AG 
I VL» I GtiALbL I WKLLM I IHDRRE FAKFEKEKMNAKWDTGENP I Y 
KSAVTTWNPKYEGK 


6078 


1426 


180 


ETEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGIIjE 
GSVRNSLWRP VPFKCPTCRKKTFS YWEL I PIjQVNYSLKGI VEKY 
NKX KISPKMPVCKGH\ LGQPLNI F\ Ch \ TDMQLDIi/CG I C\ATR 
GEHTKH VFCS I EDAYAQERDAFESLFQS FETWRRGDALSRLDTL 
ETS KRKStiQI»IiTKDSr>KVKEFFEKLQHTLDQKKNEIIiS DFETMK 
LAVMQAYDPE INKLNTI DQEQRMAF^I AEAFKDVSE PI VFLQQM 
QEFREKIKVIKETPLPPSNLPASPLMKNFDTSQWEDIKLVDVDK 
LSLPQDTGTFISKIPWSFYKLFLLrLLLGLVIVFGPTMFLEMSL 
FDDLATWKGCLSNFSS YLTKTADFI EQSVFYWEQVTDGFF I FNE 
RFKNFTLWLNUVAEFVCKYKLL 


6079 


1586 


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCRNLQEFIiGGLSP 
GVLDRLYGHPATCLAVFREIjPSI»AKNWVMRMLFLEQPLPQAAVA 
LWVKKE FSKAQEESTGLLSGLR I WHTQLLPGGLQGLIIiNPI FRQ 
NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGSPSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKD 
YSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIAIjFSE 
MLYPFP\NMW\ARVTR\ESVQQAIASGITAQQIIHFLRTRAHP 
VMI,KQTPVIjPPTITDQ 1 RLWELERDRIiRFTEGVLYNQFLSQVDF 
ELL \ LAHAPKLGVLVFE /NTPAKRLM WTPAGH SD VKR FWKRQK 
HSS 


6080 


1 


1199 


IET I DHVGEFAMAAQAAGVSRQRAATQGLGSNQNALKYLGQDFK 
TLRQQCLDSGVLFKDPEFPACPSALGYXDLGPGSPQTQG I IWKR 
PTELCPSPQFIVGGATRTDICO/3GLGDCWIjIiAAIASI»TIjNEELiL 
YRWPRBQDFQENYAGI FHFQPLCP PS? \FWQYGEWVE WIDDR 
L»PTKNGQLLFIiHSEQGNE FWS AIiLE KAYAKLNGC YEAXiAGGSTV 
EGFEDFTGGI SEFYDLKKPPANLi yqi I r kalcags LLGCS idvy 
SAAEAEA I TSQ KLVKSHA YS VTG VEE VNFQGHPEKL I RLRNPWG 
EVEKSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 
QFSRLE I CNLS PDSLSSEE VHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 


6081 


3 


865 


EMLPLLLPLPLLWA/GAIAQDARFRLEMPESVTVQEGIiCIFVHC 
S VFYLE YG WKDS TPAYGHWFREGVS VDQETP VATNNSTQKVQKE 
TQGRFHLLGDPSRNNCSLS IRDARRRDNGS YFFWVARGRTKFS Y 
KYS PLS VYVTALTH RPDILIPE FLKSGHPSNLTCS VP WVCEQGT 
PPIFSWMSAAPTSI^PRTUiSSVLTIIPRPQDHGTOLICQVTFP 
GAGVTTERT IQLS VS WKSGTVEE VVVLAVGVVAVKI LLLCLCL I 
J. Lib c ti JvjvaA V tu\ V c, V CtlSN V Y A V Mvjy 


6082 


283 


1288 


EARSPG PTQTRTAPGLAAPGLAQ PAAIiRLLLSRPPS AAMDGDGD 
PES VGQP EEAS PE EQPEEASAEEER PEDQQE EEAAAAA\ Y\ LDE 
L PEPLLA/ LRVLiAALPRH E \LVQACR \LVCLRWKELVDGAPLWL 
LKCQQEG LVPEGG VEE ERDH W QQFY FI>S KRRRNLLRWP CGEEDL 
EGWCDVEHGGDGWRVEEI/PGDSGVEFTHDESVKKYFASSFEWCR 
KAQVTDI*QAEGYWEELLDTTQPAIWKDWYSGRSDAGCL»YEIiTV 
KLLSEHENVLAEFSSGQVAVPQDSIX3GGWME ISHTFTDYGPGVR 
FWFEHGGQDSVYWKGWFGARVTNSSVWVEP 


6083 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE 

6084 






UVQEET(JI,DLSGDSVKTIAKLWDSKMFAKlMMKiHKYISKQAXA~ 
S E VMG P VE AAPE YRVI VDANN LTVE I ENELNI I H KF I RDK YS KR 

FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM 
VVSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTIAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 


6085 


1865 


309 


KQWC^ul^RGI^MSlJ^ELIiADI^EAAEEEEGGSYGEEEEEPATE - 
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA 
S E VMG P VEAAPE YR V I VDANNLTVE I ENELNT IHK F I RDK YS KR 
FPELESL VPNALDY IRT VKEIiGNSLDKCKNNENLQQ I IiTNATI M 
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 
S RMS F I APNTLS 1 1 IGASTAAKI MGVAGGI>TNLS KMPACNI MLliG 

aqrktlsgfsstsvlphtgyiyhsdivqslppipppfsvap\di, 

RRKAARIjVAAKCTLAARVDS FHESTEGKVG YELKDEI ERKFDKW 

qepppvkqvkplpapldgqrkkrggrryrkmkerlgltetr\ K q 

ANRMS FGE I EEDAYQEDI»G FSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 

vnpqaaekkvaeanqkyfssmaeflkvkgeksglmst 


6086 


2 


1456 


sgprsfo^nravgrisi^gkrnpevtllpgvsservrrwrrarv 
gvarvkpgnpwkpspatqvpr/vpaqvylpgrgpplregeelvm 
deeayvi,ykraqtgapci»sfdivrdhlgdnrtelpltlylcagt 
qaesaqsnrlmmlrmhnlhgtkpppsegsdeeeeeedeedeeer 
kpqlelamvphygginrvrvswlgeepvagvwsekgqvevfalr 
rllqweepqalaaflrdeqaqmkpifsfaghmgegfaldwspr 

VTGRLbTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 
SPTENTVFASCSADASIRIWDIRAAPSKACMLTTATAHDGDVW 

iswsrrepfllsggddgalkiwdlrqfksgspvatfkqhvapvt 

S VEWHPQDSG V FAASG ADHQ I TQWDLG / 1 VERDP EAGDVEAD *>G 

IADLPQQLLFVHQGETELKELHWHPQCPGLLVSTALSGFTIFRT 
ISV 


6087 


2419 


1357 


GAATQHGGAMNIjIiPCNPHGNG L L YAG FNQDHGCFACG.MEKGFR V 
YNTDPLKEKEKQEFLEGGVGHVEMLFRCNYLALVGGGKKPKYPP 
NK\^IWDDUCKKTVIEIEFSTEVKAVKLRR\DKIVVVLDSMIKV 

ftfthnp\hqlhvfe\tcynpkglcvlcpnsnnsllafpgthtg 

HVQLVDLASTEKPPVDIPAHEGVLSCIALNLQGTRIATASEKGT 

LIRIFDTSSGHLIQELRRGSQAANIYCINFNQDASLICVSSDHG 

TVHIFAAEDPKRNKQSStiASASFLPKYFSSKWSFSKFQVPSGSP 

CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFIiEMT 
DDKIi 


6088 


476 


1877 


ONSQRTGLPlTIFSRSFPLLTGSDLCENMPLTCrWRNWRQWIRP*- 
LVAVI YLVS I VVAVPLCVWE LQKLEVG I HTKAW F I AG I FLiLLT I 

PISLWVILQHLVHYTQPEIOKPIIRILWMVPIYSLDSWIALKYP 

GIAIYVDTCRECYEAYVIYNFNGFLTNYLTNRYPNLVLILEAKD 

QQKHFPPLCCCPPWAMGEV^LFRCKLGVLQYTVVRPFTTIVALI 

CELLGIYDEGNFSFSNAWTYLVI INNMSQLFAMYCLLLFYKVLK 

EELSPIQPVGKFLCVKLWFVSFWQAWIALLVKVGVISEKHTM 
EWOTVEAVATGLnriFT TPTPMin a att\ \ uuvud^wnvi 

VA »on»rtioi^i;r ah — i. ant ljtAAJ,A\HHYTFSYKPYVQEAEE 

SSCFDSFLAMWDVSDIRDDISEQVRHVGRTVRGHPRKKLFPEDQ 
DQNEHTSLLSSSSQDAISIASSMPPSPMGHYQGFGHTVTPQTTP 
ITAKI S DE ILSDTIGE KKEPS D KS VDS 




1684 


689


jASGDVRLLQQGHRCLI^VAPIOiVPPVRGVKKGFRAAFRFQKE"" 

[,ERQRLLRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 

<TAF VNS C YIKS EEAKRQQLG lEKEAVIJ^NLKSNQELSEQGTS F 

5QTCLTQFLEDEYPDMPTEGIKNLVDFLTGEEVVCHVARNLAVE 

3LTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFI,I 

[?QMTGKELFEMWKIINPMGI^VEELKKRNVSAPESRLTRQSG\A 
PTALPLYFVGLYCDKKLIAEGPGETVIATAEEEAARVALRKJ^YGF 
TENRRPWNYSKPKETLRAEKSITAS 


6089 


3 


3054 


TRIjGI PGS T I SS RPRLCAliAAEGHFIA3H S WTGSRAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQSLVKHSSGIKGSLPLQK 
LHLVSRSIYHSHHPTLKLQRPQLRTSFQQFSSLTNLPLRKLKFS 
P I KYG YQ PRRNFW PARIiATRLLKLR YL I LGS AVGGG YTAXKTFD 
QWKDMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKLVIiQKDD 
KGIPFIESIiRKSLIDMYSEVXiDVIjSDYDASYWTQDHIjPRWVVG 
DQSAGKTSVLEMIAQAR I FPRGSGEMMTRS PVKVTLSEGPHHVA 
LFKDS SRE FDLTKE EDLAALRHE I ELRMRKNVKEGCTVS PETIS 
LN VKGPGLQRMVLVDLPGVINTVTSGMAPDTKETI FS I SKAYMQ 
DPNAI ILCIQDGSVDAERS I VTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRIQQI I EG KLFPMKALG YFAWTG KGNS S ES IEAI 
ice i jc. c. tt r k yjM £> ivLjIjK I oPlLtKAnQVTTRNIjSIiAVSDCFWKMVRES 
VEQQADSFKATRFNLETEWKNKYPRLRELDRNELFEKAKNEIIjD 
EVISLSQVTP KHWEEI LQQSLWBRVSTHVIENI YbPAAQTMNSG 
TFNTTVDI KLXQWTDKQLPNTKAVEVAWETLQEEFSRFMTE PKGK 
EHDDI FDKLKEAVKEES I KRHKWNDFAEDSLRVI QHNALEDRS I 
SDKQQWDAAI YFMEEALQARLKDTENAI ENMVGPD\W KKRWL. YW 
KNRTQEQCVHNETKNELEKMLKCNEEHPAYLAS DE I TTVRKNUE 
SRGVEVDPSLI KDTWHQVYRRHFLKTAI*NHCNLCRRGFYYYQRH 
FVDSELECNDVVLFWRIQRMLAITAKTLiRQQLTNTEVRRLEKNV 
KEVLEDFAEDGEKKI KLLTGKRVQliAEDLKKVREIQEKLDAFXE 
ALHQEK 


6090 


194 


1560 


pvfvpapgavleqas / as p platqtwp lqhcki pejl»pvqas i l 
FEIjQIjFFCQLIA1iFVHYINIYKTVWWYPPSHPPSHTSLNFHI*ID 
FNLLMVTTI VLGRRFIGS I VKEASQRGKVSLFRS I LLFLTRFTV 
LTATG WS LCRS L IHLFRT YS FtiNLI»/ FPZjLSVWDVHS VPAAEL.R 
P\RKTSLFNHMASMGPREAVSGLAKSRDYIiLTLR\RRGSSTQDS 
CMARTPCP/PIIACCLSPSLIRSEVEFLKMDFNWRMKEVLVSSI4L 
SAYYVAFVPVWFVKNTHYYDKRWSCELFLIiVSISTSVIDMQHLL 
PASYCDLI*HKAAAHLGCWQKVDPALCSNVLQHPWTEECMWPQGV 
liVKKS KNVYKAVGH YNVAI PSDVSH FRFH FFFS KPLRI LNI IiLL 
LEGAVlVYQLYSLMSSEKWHQTISLALILFSNYYAFFKIiRDRL 
VLGKAYS YSASPQRDLDHRFS 


6091 


3279 


412 


S SRTREMEEKE I IiRRQIRLLQGI*! DD YKTLHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 
PSDFPADHAVRPLHGARGGQPPVPQQHVLERQVQLSQGQNWIK 
VKPPSKSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 
LQPSRP TRARGTCS VEDPLI*VCQKE PGK P RM VKS VGSVGDS PRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 
IiLGDRRVDAGHTDQPVPSGSVGGPARPASGPRQAREAS LWTCR 
TNKFRKWNYKWVAAS S KS PRVARRALS PRVAAENVCKAS AGMAN 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKWKASSPSASSSS 
S FRWQ S EAGSKDHASQIjS PVLSRS PSGD \ RPAVGHSGIiKPLSGE 
T P LS AY KVKSRTKI I RRRGSTSLPGDKKSGTS PAATAKSHLS I*R 

SSLHAVRTAPTSKVI KTRYR I VKKTPAS PLSAPPFPLSLPSWRA 
RRLSLSRS L VLNRLR P VASGGG KAQ PGS P WWRS KGYRC IGGVLY 
KVSANKLSKTSGQPSDAGSRPLLRTGRLDPAGSCSRSIASRAVQ 
RSLAI IRQARQRREKRKSYCMYYNRFGRCNRGERCPY IHDPEKV 
AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGICSNSN 
CPYSHVYVSRKAEVCSDFI.KGYCPLGAKCKKKHTKLCPDFARRG 
ACPRGAQCQLLHRTQKRHSRRAATSPAPGPSDATARSRVSASHG 
P RXPSAS QR PTRQTPS S AALTAAA VAAP PHCPGGS AS P S S S KAS 
SSSSSSSS P PASLDHEAPSLQEAALAAACSNRLCKLPS FI SLQS 
S PSPGAQPRVRAPRAPLTKDSGKPLHIKPRL 


6092 


143 


3190 


AKAPPTGESSEPEAKVIJITKRLYRAWEAVHRLDLILCNKTAYQ 
EVFKPENISLRNKIiRELCVKI^FLHPVDYGRKAEEbLWRKVYYE 
V 1QLI KTNKKH I HSRSTLECAYRTHLVAG 1G FYQHLLL Y IQSHY 
QLELQCCIDWTHVTDPLIGCKKPVSASGKBMDWAQMACHRCLVY 
LGDI>S R YQNELAG VDTELLAE R F Y YQALS VAPQ IGMP FNQLGTL 
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKL.SPGKKRCKDIKRI>LVNFMYLQSI.IiQPKSSSVDSEL 
TSLCQS VLEDFNLCLF YLPSS Pl^IiS IiASEDEEEYESGYAFLPDL 
LlFQMVIICLMCVHSLBRAGSKQYSAAIAFTIiALFSHLVNHVNI 
RLQAELEEGEN P VP AFQSDGTDEPE S KE PVEKEE E PDPE PPP VT 
PQVGEGRKSRKFSRLSCliRRRRHPPKVGDDSDIiSEGFESDSSHD 
SARAS EGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNIiQAMSTQ 
M FQTKR CFRIiAPT FSNLIiLQPTTNPHTS ASHR PCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSXQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDLI IVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA 
LCP E VQDLLEGCEL PDLPS S LLLPEDMALRNL P PLRAAHRR FNF 
DTDRPLLSTLEESWRICCIRSFGHFIARLQGSILQFNPEVGIF 
VS I AQS EQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 
SQLEGSliQQPKAQSAMSPYLVPDTQAIiCHHLPVlRQIiATSGRFI 
VIIPRTVIDGLDLIiKKBHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAWTLYKI LDSCKQLT \ LAQGAGEEDPSG 
M VT 1 1 TGLPLDNPSLLSGPMQAALQAAAHAS VDI KNVLDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALRVART/ SRWGAL \RGAVWAPGTRPS KRRACWALL 
P P VPCCLG CIAERWRLRPAAIiGIiRL PG IGQRNHCS GAG KAAPR\ 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTIPNMLSMTRIGLAP 
VLGYLIIEEDFNIALGVFALAGLTDLLDGFIARNWANQRSALGS 
ALDPLADKI ItlSI IiYVSLTYADIiI PVPLTYM 1 1 SRDVMJj IAAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVQLILVA 
ASLAAPVFNYADS I YLQILWCFTAFTTAASAY SYYHYGRKTVQV 
IKD 


6094 


23 


1010 


PFLRCLRGDQKAKMSERKVLNKYYPPBFDPSKIPKLKLPKDRQY 
WRLMAPFNMRCKTCGEYIYKGKKFNARKETVQNEVYLiGLPIFR 
FYIKCTRCliAEITFKTDPENTDYTMEHGATRNFQAEKL.LEEEEK 
RVQKEREDEELNNPMKVLENRTKDSKLEMEVLENLQEIiKDLNQR 
QAHVDFEAMLRQHRDSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDS EDEAAPS PLQPALRPNPTAIIiDEAPKPKRKVEVWEQS V 
GSLGSRPPLSRLWVKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQP YTPDAWRVLPEPTGCI PGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTIiSDDIKESLESEGKNSKKE 
E PQELLQSQDF VGE KLG S GEPSHS 



VKVHTVPKPGKGADLS KP PCRKAKE I RKERKRLKLMQQNPAGEl* 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQFKATXiLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FI»CS S PLEAETP PNG PDCG YGS FHQQYWLDGKI I AVGVI DI LPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLHEKTSQIjSYYY 
MG FY IHSCPKMKYKGQYRPSDLLCPETYVWVP I EQCI»PSLENS K 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRA IM PYGVYKKQQKDPS 
EEAAVLQYASLVGQKCSERMLLFRN 


6096 


2277 


575 


QRVRAAX»LSSAMEDSEALGFEHMG1jDPRLLQAVTDLGWSRPTIjI 

qekai plialegkdllarartgsgktaayaipmlqlllhrkatgp 
v^eqavrglvlvptkblarqaqsmiqqlatycardvrvanvsaa 
edsvsqravlmekpdvwgtpsrilshlqqdsiiklrdslellw 
deadlilfsfgfeeelksiilchijpriyqaflmsatfnedvqalke 
ii i lhn p vtlikiiqesqlpgpdqlqq fq wceteedkflll yai>lk 
ls li rgksllfvntlers yrbrlfleqfsiptcvlngelplrsr 

CHI I SQFNQGFYDCVI ATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARANNPGIV 
LTFVLPTEQFHLGKIEELLSGENRGP I LLPYQFRMEE I EGFRYR 
CRDAMRSVTKQAIREARLKEIKEELLHSEKLKTYFEDNPR\DIiQ 
LLRHDL PIjH PAVVKPHLGH VPDYLVP PALRGLVRPHKK\GRSCI> 
PLVGRPREQSPRTHCAASSTKERKSDPQPSPPEWGPLWS 


6097 


1573 


192 


APGTMSGGKKKSSFQITSVTTDYEGPGSPGASDPPTPQPPTGPP 
PRL.PNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVDVYERDLEPHSFGGLLEG I RGASGGAGGRSLDSRL 
ELASLGLGAPTPPSGLSQGPTSWLRPPPTSPGPQARSFTGGLGQ 
LWPSKAKAEKPPI»SASSPQQRPPEPETGESAGTSRAATPltPSI* 
RVEAEAGGSGARTPPLSRRKAVDMRIiR14ELGAPEEMGQVPPLDS 
RPSSPALYPTHDASLVHKSPDPFGAVAAQKFSLAHSMLAISGHL 
DSDDDSGSGSLVG I DNK I EQAMDLVKSHLMFAVREE VEVLKEQI 
RELAERNAALEQEWGLIiRAliAXSPEQLGSAGPPRGVPRXLGPPA 
PNGPFV1>SLPSLTIVPLGLPGIASAAWPPLPMPALIVPVFPGVG 
VQALSNGPWS PGPLPHLL 1 1 PSLDGGGEGFRTGRQQGAP PGEET 
QPPPSLPGTPQQ 


6098 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTIAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6100 


2 


713 


FVEV3G YRSRADPEPRGRDTMTYAYLFKY 1 1 IGDTGVGKSCLIiI. 
QFTD KRFQPVHDLTI GVE FGARMVN I DG KQ I KLQI WDTAGQES F 
RS I TRS Y YRGAAGALL VYDITRRETFNHLTS WLEDARQHS S SNM 
VIMLIGWKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAF INTAKE I YRKI QQGLFDVHNE ANG I KIGPQQS 1 STS VGP 
SAS QRNSRD IGSNSGCC 


6101 


1 


1399 


FRGRAWPLREVSHWLGCRRVCSWSASWGRX.PALSARLSPLLAFR 
GKMVFPLSCAVQQYAWGKMGSNSEVARIiLASSDPLAQIAEDKPY 
AELWMGTHPRGDAKILDNRISQKTLSQWIAENQDSLGSKVKDTF 
KGNLPFLFKVl>SVETPIiSIGAHPNKELAEKLHjLQAPQHYPDANH 
KPEMAI ALTP FQGLCG FRPVEEI VTFLKKVPE FQFL IGDEAATH 
LKQTMSHDSQAVASSLQSCFSHLMKSEKKWVEQLNLLVKRISQ 
QAAAGNNMEDIFGELLLQLHQQYPGDlGCFAIYFLNIiLTLKPGE 
AMFLEAOTPHAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP 
G \S VTE YKD1ALDSAS I LbMVQGTVT ASTPTTQTP I PLQRGGVL 
F IGANES VS LKLTE PKDLLI FRACCLL 
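
The legend in the column heading of this table defines how each amino acid segment is encoded: one-letter residue codes, X for an unknown residue, * for a stop codon, and / or \ for a possible nucleotide deletion or insertion. The following is a minimal sketch, not part of the original disclosure, of how a segment could be tallied against that legend. The function name `summarize_segment` and the returned field names are hypothetical, and the sketch assumes that whitespace inside a segment is only a layout artifact.

```python
# One-letter residue codes, exactly as listed in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}


def summarize_segment(segment):
    """Tally the symbols of one amino acid segment (hypothetical helper).

    Assumption: whitespace is ignored; any symbol outside the legend is
    reported as unrecognized rather than guessed at.
    """
    summary = {
        "residues": 0,    # the 20 named amino acids
        "unknown": 0,     # X = Unknown
        "stops": 0,       # * = Stop Codon
        "deletions": 0,   # / = possible nucleotide deletion
        "insertions": 0,  # \ = possible nucleotide insertion
        "unrecognized": [],
    }
    for ch in segment:
        if ch.isspace():
            continue
        if ch in AMINO_ACIDS:
            summary["residues"] += 1
        elif ch == "X":
            summary["unknown"] += 1
        elif ch == "*":
            summary["stops"] += 1
        elif ch == "/":
            summary["deletions"] += 1
        elif ch == "\\":
            summary["insertions"] += 1
        else:
            summary["unrecognized"].append(ch)
    return summary


if __name__ == "__main__":
    # A raw string keeps the backslash marker literal.
    print(summarize_segment(r"MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP"))
```

Symbols that fall outside the legend are collected rather than corrected, so a reader can see where a segment needs to be checked against the original filing.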


6102


70 


2415 


QT PQATLAANG AE DS RGG EML PAG 5 1 GAS PAAPCCSESGDERKN ' 

LEEKSDINVTVLIGS KQ VSEGTDNGDLPS YVSAFI E KEVGNDLK 

SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSAIjKNAEES 

KQFLNQFLEQETHLFSAINSHLLTAQPWMDDLGTMISQIEEIER 

HLAYLKWISQIEELSDNIQQYLMTNMVPEAASTLVSMAELDIKli 

vc» 1 «!_i1Aj r v K.t WHK 1 LKDKJbTS DF EE I LAQ LHWPF I A 

PPQSQTVGLSRPASAPE I YSYLETLFCQLLKL,QTSHELLTEPK\ 

HSQKNTLFLPPLLSS/WPIQVMLTPLQKRFRYHFRGNRQTNVLS 

KPEWYlAQVLMWIGNHTEFLDEKIQPILDKVGSLVNARLEFSRG 

LMMLVLEKLATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGYP 

GTFASCMHILSEETCFQRWLTVERKFALQKMDSMLSSEAAWVSQ 

Y KD I TDVDEMKVPDCAETFMTIiL.LV I TDR YKNLPTASR KLQ FLE 

LQKDLVDDFR IRLTQ VMKEETRASLG FR YCAI LNAVNY I S TVLA 

DWADNVFFLQLQQAALEVFAENNTLS KLQLGQLASMES S VFDDM 

INLLERLKHDMLiTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 

AVNSLSSSACPLLLTLRDHLLQLEQQLCFSLEKIFWQMLVEKLD 

VYIYQEIILANHFNEGGAAQU2FDMTRNLFPLFSHYCKRPENYF 

KHIKEACIVLNLNVGSALTAGKDVLPVQLQGSFPAT 


6103 


207 


2523 


E£NSTMTTYIjEFIQQNEERDGVRFSWNVWPSSR1,EATRMVVPVA 
ALFTPLKERPDIiPPIQYEPVLCSRTTCRAVLNPLtCQVDYRAKLW 
ACNFCYQRNQFPPSYAGI S ELNQPAELLPQFS S I EY WLRGPQM 
PLI FLYVVDTCMEDEDLQALKESNQMS LSLLPPTALVGL I TFGR 
MVQVHELGCEG I S KS YVFRGTKDLS AKQLQEMLGLS KVP VTQAT 
RGPQVQOPPPSNRFLOPVOKTDMNr/rnT T/2PT npTTDWotrDrvvn 
PLRSSGVALS I AVGLLECTFPNTGAR IMMFIGG PATQGPGM WG 
DELKTPIRSWHDIDKDNAKYVKKGTKHFEALANRAATTGHVIDI 
YACALDQTGLLEMKCC PNLTGGYMVMGDS FNTS LFKQTFQR VFT 
KDMHGQFKMGFGGTLEIKTPR\E I KI SGAIGPCVSLNSKGP CVS 
ENE I GTGGTCQ WKI CGLS PTTTLA I YFEWNQHNAP I PQGG\ RG 
A\ IQFVTQY\QHSSGQRRIRVTT1 ARN\ WADAQTQIQNI AAS FD 
QEAAA ILMARLAI YRAETEEGPD VLRWLDRQL IRLCQKFGE YHK 
DDPSSFRFSETFSLYPQFMFHL.RRSS FLQVFNNS PDESS YYRHH 
FMRQDLTQSLI MI QPI LYAYS FSGP PE P VLLDS S S I LAD R I LLM 
DTFFQIL I YHGET I AQWRKSG YQDMPE YENFRHLLQAPVDDAQE 
ILHSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAP I LTDD VSLQVFMDHL KKLAVSS AA 


6104


124 


732 


KVSEYI I LSKDKILFHAIiAMLVLWSPWSAARGVLRNYWERLLR 
FCLPQSRPGFPSPPWGPALAVQ\AQPC1.QSQQMIPVEVKRI/RSL 
LDS I FWt4AAPKNRRTIEVNRCRRRNPQKLI KVKNNI DVCPECGH 
LKQKHVLCAYCYEKVCKETAE I RRQ IG KQ EGG PFKAPTI BTWL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


989 


PLHGACTSL.VLQRFCHRRPRPCAPARPEDMRRPAAVPLLLLLCF 
GSQRAKAATACGR PRMLNR M VGGQDTQEGEWP WQVS IQRNGS HF 
CGGSLIAEQWVLTAAHCFRNTSETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVADVELEAPVPFTNYILPVCLPD 
PSVTFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQSWLQAGVI S WGEGCARQNRPGVYIRVTAHHNWIHRI I PK 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 



GRPPTAPHTGRPPTANRGDPRLDLKRGCARIiLTSIESRGRPAAS 
AGIiRRDRCALRRWPLRRAPIiARATRRRAG S PRRCAPRPRACPQG 
WSRARHCPGGLCLLI#L1>LCQFMEDRSAQAGNCWLRQAKNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
IPCKETCENVDCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYOGRCKKTCRDVFCPGS 
STCV\ VDQTNNAYCVTCNR I CPE P AS SEQYIjCGNDGVTY S \S AC 
HLRKATCLLGRS I GLAYEGKC I KAKS CEDIQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKHSGSCNS ISEDTEEEEEDEDQDYSPPI SSILEW 


6107 


623 


i^a 


S RCS S PR PE PGRGRG K / 1>S PS EHRKWVE VPKACDEDHKG YliS RE 

nT?VP2Hn/MT T?r , VV ,, pc:'K'T1l , Vr>QVM«? < ?TNPNTSGTIjIjEGFItNIVRK 
Upr^X AV v VIXjz «j> I A.c oxvXCiV v no O x v* r j.> x ou j. uui jjv* jl v 

KKEAQR YRNEVRH I FTAFDT Y YRG FLTLED FKKAFRQVAPKLPE 
RTVLE VFRE V \ DRDS \ DGH VS F 


6108 


3 


1348 


GGSIiRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW | 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 1 
YYFYDLLVYWIGI FCLASATGL YSCI»APCVRR1.P\SASAGESA \ 
L1APTI PNNSLP YFHKRPQARMIXIiALFCVAVSVVWGVFRNEDQ 
WAWVLQDALG I AF CL»YMI»KT I RL,PTFKACII»LLIjVLFL YDI FFV 
FITP FliTKSGS S I WVEVATGPSDS ATRBKLPM VLKVPRLNSS P t* 
ALCDRPFSLLGFGDIIjVPGLIjVAYCHRFDIQVQSSRVYFVACTI 
AYGVGLLVTFVALALMQRGQPALIiYIjWCrrr.VTSCAVAIiWRREl> 
(jVr W i Vjij\jr AK. V Xjtr for W>\r>^rUAjJrW* ^"^"W^^^*" 11 
PATSPWPAEQS PKSRTSEEMGAGAPMREPGS PAESEGRDQAQPS 
PVTQPGASA 


6109 


1 


1381 


CRSRAGAASGGAI LEGTKLRRQR VDTNKP LD P L V P S ALRAAML.Y 
LEDYLEMIEQLPMDLRDRFTEMREMDLQVQNAMDQLEQRVSEFF 
MNAKKNKP EWREEQMAS I KKD Y YKALEDADEKVQIiANQ I YDL>VD 
RHLRKLDQELAKFKMEI>EADNAG ITE ILERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTTDHI PEKKFKSEALbSTLTSDA 
SKENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG 
GI \TMAAAQAVQATAQMKEGRRTSSLKAS YEAFKNNDFQLGKEF 
SMARETVGYSSSSAIjMTTLTQNASSSAADSRSGRKSKNNNKSSS 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNEPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464 


ACPSAATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGSGAFSQ 
ARSSSTGSSSSTGGGGQESQPS PLAI .1 AATCSR I ES pnensnns 
QGPSQSGGTGELDLTATQLSQGANGWQI I SSSSGATPTSKEQSG 
SSTNGSNGSESSKNRTVSGGQYWAAAPN1»QNQQVLTGLPGVMP 
NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGOIQI I PGANQQ 
I ITNRGSGGNI IAAMPNLLO^AVPIXJGLANNVLSGQTQYVTNVP 
VAXiNGNITLIiPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
GTTISSASLVSSQASSS SFFTNANSYSTTTTTSNMG IMNFTTSG 
SSGTNSQGQTPQRVSGLQGSDALNIQQNQTSGGSLQAGQQKEGE 
Q\NQQTQAAPKSI .SRPQt,VQGG\QALQ\AFQAAPLSGQTFTTOA 
ISQETI^NI^LG^VPNSGPIIIRTPTVGPNGQVSWQTI^I*QNLQ 
VQNPQAQT I TIAPMQGVS LGQTS SSNTTLTPI ASAAS I PAGTVT 
VNAAQI»SSMPGLiQT I NLSALGTSG I QVHP I QGLPIA I ANAPGDH 
GAQLGX>HGAGGDG I HDDTAGGEEGENS PDAQPQAGRRTRREACT 
CPYCKDSEGRGSGDPGKKKQHICHIQGCGKVYGKTSHLjRAHLRVJ 
HTGERPFMCTWSYCGKRFTRSDEJLQRHKRTHTGEKKFACPECPK 
RFMR SDHLS KHI KTHQNKKGG PGVALS VGTLPLDSGAGSEGSGT 
ATPSALITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF 

6111 




6112 


1637 
77 


797 
196 


RVDPRVROAMAPWGKRIiAGVRGVLLDISGVLYDSGAGGGTAIAG 
SVEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDISEQE 
VTAPAPAACQILKERGLRPYIiLlHDGV\ASEFDQIDTS /STPNC 
VVI ADAGES FS YQNKNNAFQVLMELEKPVLI SLGKGRYYKETSG 
LMLDVGP YM KALE YACG I KAEVGGK PS PE FFKS ALQA I GVE AHQ 
AVM I GDD I VGD VGGAQR CGMRALQVRTGKFR PSDEHH PE VKADG 
YVDNLAEAVDLLLQHADK 


6113 
6114 


1779 


567 


MSSHKSFKSKRFLAKKQKPNRPILQWI WLKTGNKIRHNWK " 
WEGRSWAACGVNIjQGAWGERSGVRASEAESPGKRADVSWWSRQL 
ETMVDH LANTE INSQRIAAVESCFGASGQPLALPGRVLLGEGVL 
TKECRKKAKPR I FFLFND I LVYGS I VLNKR K YRSQH 1 1 PLE EVT 
LELLP ETLQAKNR WM 1 KTAKKS F WS AAS ATERQE W I SH I EE C V 
RRQLRATGRPA\ STEHAAP WI PDKATD I CMRCTQTRFSALTRRH 
HCRKCRVWCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 
EEAEEQGAGVPRAASHLARP I CGRPVEMTMTPTRTRRAAGTATG 
PAAWSSTPRGWPGLPSTADPR PAEHLS PS QLH CPG PQEGS S RS C 

PGLRDPIPWWQVQRWGVALSGLPVPFCWTLCPYGFTAGNAFPFq 
KPQNTHRSW 


6115 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PAEQVQC 
GHLPPHADRRALRLPVAAP ARG PGPGHPAGPAGPRPARTPPAS ° 
HGPGRP TVPAP P C PLLAATE PTPS R PHQR WTREDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6116 


324 


71 


DVCGRVCAHPHLYTH I HM H I CAHAC \ I HTHAQLC/ 1 TASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR ! 


6117 


595 


1430 


tgvmppgrwhaa/isssgpvfegara\lqtvkkbeedesytpvq 

AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWIV 

lrpdvlskaqilellvleqflsilpgelrvwvqlhnpesgbe\l 

WPCWRS CRGTLMGHPGGTRALP\ EPRCALDGYRS \ LRSAQI WS L 
ASPLRSSSALGDHLEPPYEIEARDFLAGQSDTPAAQMPALFPRE 
GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6118 


1433 


222 


VUVPSPAPPCSWEVGPGGGWTPGILKHGQGGRRTPLLLLATRTR 
GLLSLFPPAAMHPAAFPLPVWAAVLWGAAPTRGLIRATSDHNA 
SMDFADLPALFGATLSQEGLQGFL.VEAHPDNACSPIAPPPPAPV 
NGSVFIALLRRFDCNFDLKVLNAQKAGYGAAVVHNVWSNELLNM 
VWNSEEIQOQIWIPSVFIGERSSEYLRALFVYEKGARVLLVPDN 
TFPLG YYL I P FTG I VGLLVLAMGAVMI ARC IQHRKRLQRNRLTK 
\EQLKQI \ PTHDYQKGDQ YDVCAI CLDE YE DGDKLRVLPCAHAY 
HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDB 

GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6119 


1044 


247 


STI S CRACTSGATPG AQSHRSARGHAA^fi KFT A AT /^towvr itktv v — 
KEKEKETQKEKIGEKGREEKVKRKEVEQKIKQEKQEKQERRKGK 
E KEEKRTKQGKETNKE KEQ FKGQEEKGENKDS TLTRTPLEPLEK 
NKQILVIjGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCINTE 
DSQMEFLE IGGS KPFRSYWEMYLSN/ ADS LARS FSVGFKQDSQP 

itmkakkyt^qliaanpvlplvvfankqdleaayhitdiheala 




1217 


462 


DPRFVTENTTKAPAQERTTQPRSSREGTLRSTMEYliSALNPSDL" 

LRSVSNISSEFGRRVWTSAPPPQRPFRVCDHKRTIRKGLTAATR 

QEl^KALETLLIJ^GrvrLTLVLEEr3GTAVDSBDFFQJJUBDDTCIJ^ 

VLQSGQSKSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 

DLFGSI J NVKATFYGl,YSMSCDFOGL\GPKKVLRELi^WTSTLLQ 

GLGHMLLG ISSTLRHAVEGAEQWQQKGRLHSY 






WO 01/53312 



PCT/US00/34263 





6120 


785 


179 


LERAGGGGIaSSRALVGSGACI»S LVARANGKGLPRGRKEFVEAVR 
VR YVAFRYRTPRAVCLRLWS CRREVI MSGRGKQGGKVRAKAKSR 
S S RAGLQ FPVGRVHRLI>RKGNYAERVGAGAPVYIiAAVliEy IjTAE 
I LELAGNAARDNKKTR I IPRHLQLAIRNDEELNKLLGKVTI AQG 
G\ VLPNI QAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


FVRAQARGSRQPVRRPLIiGAGSRLRCRSCGRMEPLKVEKFATAN 
RGNGLRAVTPLRPGELLFRSDPLAYTVCKGSRGWCDRCLLGKE 
KbMRCSQCRVAKYCSAKCQKKAWPDHKRECKCLKSCKPRYPPDS 
VRLLGR WF KLMDGAP S ESE KI.YS F YDLESNINKL.TED KKEGLR 
QLVMTFQHFMREEIQDASQLPPAFDLFEAFAKVICNSFTICNAE 
MQEVGVGLYPS ISLLNHSCDPNCS I VFNGPHLLIiRAVRDIEVGE 
EI/F I CYLDMLMTS EERRKQLRDQ YC FECD\ CFRCQTQD KDADML 
TGDEQVWKEVQESIiKKIEELKAHWKWEQVLAMCQAIISSNSERL 
PDINIYQLKVLDCAMDACINLGLLEBAIiFYGTRTMEPYRI FFPG 
SHPVRGVQVMKVGKI*QI*HQGMFPQAMKNLRliAFDIMRVTHGREH 
SLIEDLILLLE/AMRRQHQSILRERSQREIRRVSLLNALLRSHT 
LCFVSCVNLSYWKFCSVFV 1 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGDG 
NTGTQTNGLDFQKQPVPVGGAISTAQAQAFLGHLHQVQIAGTSL 
QAAAQSUWQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQIiMIA 
GGQ I TGLTIiTPAQQQI»I»I*QQAQAQAQ1JjAAAVQQHSAS QQHSAA 
GAT I S ASAAT PMTQ I PLS Q P I Q I AQDLQQLQQLQQQNIiNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNI*LTQLPRQSQAN 
LIiQSQPRI\TI>TSQPATPTCTIAATPIQTLPQSQSTPKRIDTPS 
LEEP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGbAMVKLYGND 
FSPTTIFRFEAiNLSFKNMCKLKPLl»EKWLNDAENLSSDSSIjSS 
PSALNSPGIEG1jSRRRKKRTSIEA\NIRVAIiEKSFIjEN\QKPTS 
EEITMIADQLNMEKGVIRVWFCNRRQKEKRINPPSSGG\TSSSP 
IKA3FPSPTSIiVATTPSLVTSSAATTLTVSPVl.PLTSAAVTNI»S 
VTGTSDTTSNNTATVISTAPPASSAVTSPSLSPSPSASASTSEA 
SSASE TSTTQTTSTPIjSS PIjGTSQVMVTASGLQTA/ AQLLPFKG 
AAQLPANASIiAAMAAAAGLNPSLMAPSQFAAGGAl»I*SLNPGTIjS 
GALSPALMSNSTLATIQALASGGSLPITSLDATGNIiVFANAGGA 
PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 
TSTSAES IQN2LFTVASASGAASTTTTASKAQ 


6123 


3 


2944 


HIjLHRWFGTDMQM I NFTTGE FQLTE AC P YI/5THS EESRFGI IjHI* 
HLQPLEMKRVGWFTPADYGKVTSLILIRNNDTVIDMIGVEGFG 
AREIjIiKVGGRLPGAGGSLRFKVPESTIfMDCRRQI»KDSKQII*S I T 
KNFKVENIGPLP I TVS SLKINGYNCQGYGFEVJ*DCHQFSLDPNT 
SRDI S I VFTPDFTS S WVI RD1»S LVTAADLEFRFTLNVTLiPHHLI» 
PLCADWPGPSWEES FWRLT VF FVSLS IiLGVXLI AFQQAQY ILM 
EFMKTRQRQNASSS SQQNNGPMDVIS PHS YKSNCKNFLDTYGPS 
DKGRGKNCLP VNTPQSRIQNAAKRS PAT YGHSQKKHKCSVYYS K 
UVTQTA a aqciT^TTTEEKOTS PLGSSLPAAKEDI CTDAMRENWI 
SI^YASGINV^rU3Kl^ J TLPKNLI^KEENTL^aSr^IVFSNPSSECS 
MKEGIQTCMFPKETDI KTSENTAEFKERELCPLKTSKICLPENHL 
PRNSPQYHQPDLPE ISRKNNGNNQQVPVKKBVDHCENLKKVDTK 
PSSEKKIHKTSREDMFSEKQDIPFVEQEDPYRKKKLQEKREGNL 
QNLNWS KS RTCRKNKKRGVAPVSR PPEQSDLKTjVCSDFERSEIjS 
S DINVRSWC I QESTRE VCKADAE I AS SIiPAAQREAEG Y YQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDS VSQNDFPSEAP I SLNLSHM ICNPMTGNSLPQY 
AE PS CPSLP AGPTGVEEDKGLYS PGDIjWPTP PVCVTS SLNCTLE 

kgvpcv1qesapvhnsfidwsatcegqfssaycplelndynafp 
eenmnyangfpcpadvqtdfidhnsqstwntpp\nmpas\wgna 
qfpsssrpyi,kstpkaclpmsglfgpi\wap\qsdvyenccpin 


6124 






PTTEHSD/ THMENQA\WCKEyYPGF\WPFRAYMNLDIWTTT\A"~ 
NRNAN FPbSRDSS YCGNV 


6125 


1573


236 


S DEAIiRLAGERGMGR VQ1> FE I S LSHGR WYS PGE PLAGT VR VRX> 
GAPLPFRAIRVTCIGSCGVSNKANDTAI^VVEEGYFNSSLSLADK 
GSLPAGEHSFPFQFLLPATAPTSFEGPFGKIVHQVRAAIHTPRF 
S KDHKCS LVFY ILS PLNLNS I PDIEQ PNVASATKKFS YKLVKTG 
SWLTASTDLRGYWGQALQLHADVENQSGKDTSPWASIiLQKV 
S YKAKRW I HDVRT I AE VEGAG VKA WRRAQWHEQI LVPALPQSAL 
PGCS L I H I D Y YI*Q VSLKA PEATVTLPVF I GNI AV /N PCPSE P PA 
R PGAAS WG PTPGG \ PSAPPQEEAEAEAAAGGPHFLDP VFLSTKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 

PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6126 


1 


904 


KTCP KLTCAFTVS VP DS CCR VCRGDGEIiS WEHSDGD I FRQ PAN R 
EARHS YHRSH YD PP PSRQAGGLSR F PG ARS HRGALMDSQQASGT 
IVQIVINNKHKHGQVCVSNGKTYSHGESWHPNTiRAFGlVECVLC 
TCNVTKQECKK1HCPNRYPCKYPQKIDGKCCKVCPG /KKAKEEL 
PGQSFDNKGYFCXSEETMPVYESVFMEDGETTRKlAIiETERPPQV 
E VHVWTI RKG I LQH PHI E KI S KRM FEELPHFKIiVTRTTLS QWKI 
FTEGEAQ I S QMCSS R VCRTELEDLVKVIj YLERSEKGHC 


6127 


1224 


389 


RLLSEAPCPRSRRRFQMNPEWGQAFVHVAVAGGLCAVAVFTGIF - 

DSVSVQVGYEHYAEAPVAGLPAFIiAMPFNSJbVNMAYTLLGLSWI. 

HRGGAMGIiGPRYLKDVFAAP^LYGPVQWLRLWTQWRRAAVLDQ 

WLTLPIFAWPVAWCLYLDRGWRP\WI>FLSLECVSLASYGLALLH 

PQGFEVALGAHWPAVGQAURT\HRHYG/SATPSATYIALGVLS 

CLGFVVLKXCDHQLARWRLFQCLTGHFWSKVCDVLQFHFAFLFI, 

THFNTHPRFHPSGGKTR 


6128 


1335 


463 


VLPRRCLVFWNTMDSSREPTU3RLDAAGFWQVWQRFDADEKGY 
I EEKELDAFFLHMLMKLGTDDTVMKANLHKV KQQ FMTTQDAS KD 
GRIRMKELAGMFLSEDENFLIjLFRRENPI.DSSVEFMQIWRKYDA 
DSSGF I SAA£I*R1JFLRJDLFIjHHKKAXSEAKI»EEYTGTMMKI FDR 
NKDGRLDI^IJu^IIJu^QENFLLQFKMDACSTEKRKGDFEKIFA 
YYDVSKTGALEGPXEVDGFVKDMMELVQPSISGVDLDKFREIDL 

rhcdvnkdgkiqkselalclglkinp 


6129 


2511 


843 


TCRMSRRQLERW VWS SQQ VQARGRNVRAPRLGKI AMGI*BMS S KD 
SPGSLDGRAWEDAQKPQSAWCX3GRKTRVYATS SRRA PPS EGTRR 
GGAAR PEKTAEEGPPAAPGSIiRHSGPLGPHACPTAI* PEPQ VTSA 
MSSQWGIEPLYIKAEPASPDSPKGSSETETEPPVALAPG\PAP 
TRCLPGHKEE EDGEGAG PG EQGGGKL VLSSLPKRLCLVCGDVAS 
G YHYGVAS CEACKA FFKRT I QGS I EYS CPASNECE I TKRRRKAC 

qacrftkclrvgmlkegvrldrvrggrqkykrrpevdplpfpgp 

FPAGPIiAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 

GHLPAVATLCDIiFDREIWTISWAKSIPGFSSLSLSDQMSVLQS 

VWMEVLVLGVAQRSLTLQDELAFAEYLVIiDEEGARPAGLGELG\ 
AALLQLVRR DQALRIiEREK Y vt .T & T .a t. a kt cnwni t pnpDDT wet 

SCEKL^EAIiEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG 

kvi^fygvklegkvpmhklflemleammd 


6130


1764 
3 


771 
577 


ARFARSAHEGKMP KKKTGARKKAENRREREKQLRASRSTIDIiAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQiCLPICAQCGKTKCMM 

kssdcvikhagvystglamvgaicdfceamvchgrkclsthaca 
cpltdaec\vecergvwdhggri FSCSFCHNFLCEDDQFEHQAS 
cqvleaetfkcvscnrlgohsclrckacfcddhtrskvfkqekg 
kqppcpkcghetqetkdlsmstrslkfgrqtggeegdgasgyda 
ywkni>ssdxygdtsyhdeeedeyeaeddeeeedegrkdsdtess 
dlftnlnlgrtyasgyahyeeqen 

5RGGTMREYKVWLGSG\GVGKSALTvVQFVTCTFIEkYDPTI E 
DFYRKEIEV\DSSPSVAGISWTQQGTEQF\ASMRDLYIKKGQGC 
ILVYSIiVNQQSFQXDIKPMRDQIIRVKVSEKVPVlXLVGNXSVD 
LE S ERE VSS SEGRAIiAEEWGCP FMETSAKSKTMVDEL FAE I VRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 

G VAAGTRRPNVVLLLTDDQDEVLGGMTP LKKTKAL IGEMGMTFS 
SAYVPSALCCPSRASIIiTGKYPHNHHWNNTLEGNCSSKSWQKI 
QE PNTFPAI LRSMCGYQTFF\AGKYLNEYGAPDAGGLEHVPLGW 
S YWYALEKNSKYYNYTLS INGKARKHGEN YSVDYIiTDVLANVSL 
DFLDYKSNFEPFFMMTATP \APHS PWTAAPQYQKAFQNVFAPRN 
KNFN IHGTMKHWLIRQAKTPMTNS S IQFLDNAFRKRWQTLLSVD 
DL VE KLVKRLE FTGELNNT YI FYTSDNGYHTGQFSLP I DKRQLY 
EFDI KVPLLVRGPG I KPNQTSKMbVANIDLGPTI LDI AGYDLNK 
TQMDGMSLLP I LRGASNIjTWRSDVIjVEYQGEGRNVTDPTCPSIjS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALWNLQYCEFDDQEVFV 
BVYNIjTADPDQITNIAKTIDPELLGKMNYRLMMLQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 


6132 


96 


1241 


AAGI»L.PPGI*VPEDPRRTRNLIiPFGIQGPPFALSRPliFSCVESGW 
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRI*AALKLEAEDIALTATSQKHKLTWLEAVNRS\CSWRSGRP 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI 

estksglkseklveqlteystdkdeppkdvfdelfkiapekvna 

VKEAI VNFVNQKLDRIiGIjS VQNLDTQFADG VI LLLLIGQLEGFF 
LHLKEFYLTPNSPAEMLHNVTLALEIJj/IGRGPAQLPC/LALK/ 

tivnkdakstlrvlyglfckhtqkahrdrtphgapn 


6133 


2 


4256 


fvhgsmadtdiifmeceeeelepwqkisdviedswedyksvdkt 
ttvsvsqqpvsapvpiaahasvagklststtvsssgaqnsdstk 
ktlivtliannnagnplvqqggqpliiitqnpapglgtmvtqpvlr 
p vqvmqnanhvtss p vasq pi fittqgfp vrnvrp vqnamnqvg 
ivlnvqqgqtvrpitlvpapgtqfvkptvgvpqvfsqmtpvrpg 
stmpvrpttntfttvipatltirstvpqsqsqqtkstpststtp 
tatqptsi^iavqspgqsnqttnpkiapsfpsppavsiasfvt 
vkrpgvtgensnevaklvntlntipslgqspgpwvsnnssah\ 
gsqrtsg pessmkvtss i p vfdlqdggrki cprcnaqfrvteai* 

RGHMCY CCPEMVE YQKKGKSIjDS EPS VPS AAKP PS P EKTAPVAS 

/THPS STP I PALS P P Y /TKVPE PNENVGDAVQTKIj I MLVDDFYY 

GRIXSGKVAQLTNFPKVATSFRCPHCTKRIiKl^ 

IiDQQNG EVDGHT ICQHCYRQFST PFQLQCHLENVHS P YESTTKC 

KI CEWAFES E PLFI>QHMKDTHKPGEMP yvcqvcq YRS SLYSEVD 

VHFRMIHEDTRHLI>CPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 

CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 

SRGQPRTVP VS SNDTP PSALQEAAPLTS S MDPLP VFItYPP VQRS 

IQKRAVRKMS VMGRQTCLECS FE I PDFPNHFPTYVHCSLCR YST 

CCSRAYANHMINNHVPRKSPKYIiALFKNSVSGIKIiACTSCTFVT 

S VGDAMAKHLVFNPSHRSSS I LPRGLTWI AHSRHGQTRDRVHDR 

NVXNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 

S TATPPPTPTHPQALAIiPPLATEGAECLiNVDDQDEGS PVTQE PE 

IJ^GGGGSGGVGKKEQLSVKIUjRVVXiFALCCNTEQAAEHFRNPQ 

RRIRRWLRRFQASQGENIiEGKYLSFEAEEKIiAEWVLTQREQQLP 

VNEETLFQKATKIGRSIiEGGFKISYEWAVRFMIjRHHIjTPHARRA 

VAHTIiPKDVAENAGLFIDFVQRQIHNQDLPLSM I VAIDE I SI*FL 

DTEVLSSDDRKENALQWGTGEPWCDVVI*AILADGTVI,PTLVFY 

RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 

RSKGMLVNIDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
IiGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYG FEEADLDLMEI 


6134 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAriASVAGHLSTSTTVSSSGAQNSDSTK 
KTI>VTL.I ANNNAGNPLVQQGGQPL I IiTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTSS PVASQPI FITTQGFPVRNVRPVQNAMNQVG 
IVLNVQQGQTVR P I TLVPAPGTQF VKPTVG VPQ VFSQMTP VRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQIiAVQSPGQSNQTTNPKLAPSFPS PPAVS IAS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPWVSNNSSAH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRKI CPRCNAQFRVTEAL 

RGHMCYCCPEMVEYQKKGKSLDSBPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIPOLVDDFYY 
GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 
C-DQQNGEVDGHTI CQHCYRQFSTPFQLQCHLENVHS PYESTTKC 
KI CE WAFES E PI»FLQHMKDTH KPGEMP YVCQVCQ YRS S LYS E VD 
VHFR M I HEDTRHLLCP YCLXVFKNGNAFQQHYMRHQKR \ NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGOPRTVPVSSNDTPPSAIiQEAAP]JTSSMDPLPVFL>YPPVQRS 
IQKRAVR KMS VMGRQTCLECS FE IPDFPNHFPTYVHCSLCR YST 
CCSRAYANHMINNHVPRKS PKYUVLFKNSVSG I KLACTSCTFVT 
S VGDAMAKHLVFNPSHRSSS IliPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYP PPS FPTNKAATVKSAGATPAEPEELLTPLAPALPS PA 
STATPPPTPTHPQAIJU.PPIJVTEGAECIJJVDE)QDEGSPVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEXLAEWVLTQREQQIiP 
VNEETLFQKATKIGRSLBGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLS S DDRKENAIiQTVGTGEP wcdwiai ladgtvlptlvfy 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGWIiVMDCHRTHIjS EEVLAMLSAS STLPAVVPAGCSSKI QPI* 
DVC I KRTVKNFLHKKW KEQAREMADTACDSDVIjI,QLVLVWLGE V 

lgvigdcpelvqrsflvasvlpgpdgninsptrnadmqeelias 

leeqlki»sgehsesstprprsspeetiepeslhqi»fegesetes 
fygfeeadldlme I 


6135 


2 


4256 


fvhgsmadtdlfmeceeeelepwqkisdviedswedynsvdkt 
ttvsvsqqpvsapvpiaahasvaghiiststtvsssgaqnsdstk 

KTLVTIj I ANNNAGNPI*VQQGGQ Pit I LTQN PAPGLGTMVTQP VLR 
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVRP ITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS I AS FVT 
VKRPGVTGENSNEVAKLVNTLNTI PSLGQS PGPVWSNNSSAH \ 
GSQRTSG PESSMKVTSS I PVFDLQDGGRKI C PRCNAQFR VTEAL 
RGHMCYCCPEMVE YQ KKGKS LDS E PS VPS AAKP PS PEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQIiTNFPKVATS FRCPHCTKRLKNN I RFMNHMKHHVE 
LDC^NGETOGHTICQHCYRQFSTPFQI^CHLEMVHSPYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSBVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFIiFAKDKIEHKLQHHKTFRKPKQLEGIjKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSAIiQEAAPIiTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGI KLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYPPPS FPTNKAATVKSAGATPAEPEELLTPIiAPALPS PA 
STATPPPTPTHPQAIiALPPIiATEGAECLNVDDQDBGSPVTOEPE 
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 
RR I RR WLR R FQASQG ENLEG KYLS FEAEEKLAEWVLTQREQQLP 
WEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLF IDFVQRQ I HNQDLPLSMIVAI DEISLFL 
DTEVLSSDDRKENALQTVGTGEPWCDVVXiAIIiADGTVLPTIjVFY 
RGQMDQPANMPDS ILLEAKESGYSDDEI MELWSTRWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMliSASSTLPAWPAGCSSKIQPL 
DVCIKRTVKNFIiHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVIGDCPEI.VQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRiGQVAS 
SLFRGEHHSRGGTGRIASLFSSLEPQIQPVYVPVPK\ESALASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKILJDDTEDTVVSQRKKIQ 
INQEEERLKNERTVFVGNLPVTCN KKKLKS FFKEYGQ I E SVRFR 
SLI PAEGTLS KKLAAI KRKIHPDQKNINAYWFKEESAATQALK 
RNG AQ I ADG FRI RVD LAS ETSSRDKRS VFVGNLP YKVEE S AI EK 
HFLDCGS I MAVR I VRDKMTGIG KG FG YVLFENTDS VHIALKLNN 
SELMGRKLRVMRSVNKEKFKCKJNSNPRLKNVSKPKQGLNFTSKT 
AEGHP KSLF I GEKAVLLKTKKKGQKKSGRPKKQRKQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNMLIVAMCIA\LLGLPGKAQELQGHVS\I ILAGEQLGDLAKK 
YLWQG VLFQLYLDEAGRGHS FSFHGAALTAPKQGQELMAKALES 
LSCP KDMAPSHCAEH KDQ FLQLSQYRQLKTAEDYQALN KD 1 EAQ 
LQHAGIiREAGG I FYFS VP PFAYEDIARNINSSCRPGPGAWLRW 
LEKP FGHDHFS AQQ tiATELGTFFQEEEM YRVDHYLGKQAVAQ 1 1» 
PFRDQNRKAIiDGLWNRHHVERVEI IMKETVDAEGRTSFYEE YGV 
IP3VLQNHLTEVliTIaVAMEL?KlWSSAEAVL,RHKl»QVFOALRGL 
QRGSAWGQYQS YSEQVRRE£>QKPDSFHSLTPTFAGVLVH I DN L 
RWEGVPFILMSGKALDERVGYARILFKNQACCVQSEKHWAAAQS 
QCtiPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG 
LRLFGS PLSDYYAYS PVRERJDAHSVLLSHI FHGRKNFFITTENL 
LASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
ANDI EATAVRAVRRFGQF1ILALSGGS S PVALFQQLATAHYG FPW 
AHTHLWLVDERCVPL3DP ESNFQGLQAHIiLQHVR I PY YNI H\AM 
PVHLQQRLCT^EDQGAHI YARE ISALGANSS FDLVLLGMGADGH 
TASLF PQS PTGLDGEQLWLTTS PSQ PHRRMSLS L PL INRAKKV 
AVLVMGRMKRE I TTLVSR VGHE PKKWP ISGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587 


934 


EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLFRFL 
TDTSHLLS AVKGQERFSLlf QTRSLIHELKNKE IH FQRRRTTCAL 
TLEAGEKLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQS TVETWDQCEKKI KELXSRLQVLKAQSEDPLPELHEDLHNEK 
ELI KELEQ S LAS WTQNLKELQTM KADLTRHVLVEDVMVLK EQ I E 
HLHRQWEDLCLRVAIRKQEIEDRLNTWWFNEKNKELCAWLVQM 
ENKVLQTAD I S I EEM I EKLQKDCME EINLFSENKLQLKQMGDQL 
I KASNKS RAAEI DDKLNK.INDRWQHLFDVI GSRVKKLKET FAF I 
QQLDKNMSNLRTWLARI ESELSKP WYDVCDDQEIQKRLAEOQD 
LORD IEQHSAGVES VFNI CDVLLHDS DACANETECDS I QQTTRS 
LDRRWRNI CAMSMERRMKIEETWRLW QKFLDDYSRFEDWLKSAB 
RTAACPNSSEVLYTSAKEELKRFEAFQRQIHERLTQLELINKQY 
RRLARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT 
NQREEFEGTRESIIjVWLTEMDLQLTNVEHFSESDADDKMRQLNG 
FQQEITLNTNKIDQLIVFGEQLIQKSEPXLDAVl.IEDErjEELHR 
YCOKVFnRvr^PFTIPT?T.T^r^pnT.wnT7TTTra CTTNTTT^riMTrriD-Dt7 TrxTi 

* *-V" VI VJIVV OX\I IJIvivu L i3L X ■ ■"■ \ • "■ r\ ~- ~- F' 3 r- J. Ul'lliiJfc'XiS ly A 

dswrkrgeseepss pqslchlvapghersgcetpvs vds \iple 
wdhtgrrggpsssh\eedeeaqyy\salsgksisdghswhvpds 
pscpehhykqmegdrnvppvppasstpykppygklllppgtdgg 
kegprvlngnpqqedgglagiteqqsgafdrwemiqaqel\hnk 
lki kqnlqqlnsdi saittwlkkteaelemlkmakppsdiqeie 
lrvkrlqe i lkafdt ykal ws vnvsske flqtes pestelqsr 
lrqlsllweaaqgavdswrgglrqslmqcqdfhqlsqnlllwla 

S AKNRRQ KAHVTDP KAD PRALLECRR ELMQLE KELVERQPQVDM 
x»y & x ojv o AjJj x iMjtUj c.Li t_ L te*Ahj IS R.VnV X \ h JlJsXt t%.y JuKEQ V SQDIjM 
ALQGTQN PAS PL PS FDEVDSGDQPPATS V PAPRAKQFRAVRTTE 
GEEETESRVPGSTRPQRSFLSRVVRAALPLQLIJLLLLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERR PWEAS SWKTL/LAGWIGGAAS VI VGHPLDTVKTRLQAGVG 
YGNTLS C I R W YRR ES MFG FFKGMS FPLAS I AVYNS WFGVFSN 
TQRFIjSQHRCGE PEAS PPRT LS DLLLASMVAG WS VG LGGP VDIj 
I KI RLQMQTPPVSGRQPRFEVQGSGSCG \EPAYQGPVHCITTI V 
PwNEGLAGI.YRGASAMLl,RDVPGYCI.YFIPYVFIiSEWlTPEACTG 
PSPCAVWIiAGGMAGAI S WGTAT PMDVVKSRLQADGVYX»NKYKGV 
LDCISQSYQKEGLKVFFRGITVI^VRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


634 


IOC 

JL J D 


RPELELWRIjRSRSWR PLG V PRRCHRRNWKEPVRAQPLS VTVWAP 
RCQRP/ QPPAPEPSSPNAAVPEAI PTPRAAASAALELPLGPAPV 
S VAPQAEAEARSTPGPAGS RLGPET FRQRFRQFR YQDAAGPREA 
FRQLREL/SPRQWLRPDI \RTKEQ\ IVEMI1VQEQLI1AILPEAAR 


6141 


2 


984 


AQVGPRSRPCKMPLKLRGKKKAKSKETAGLVEGEPTGAGGGSLS 
ASRAPARRL VFHAQLAHG SATGRVEG FSS IQ EL YAQIAGAFEIS 
PSE I LYCTLNTPKIDMERLLGGQLGLEDFI FAHVKGIEKEVNVY 
KSEDSLGLTI TDNGVG YAF I KRI KDGGVIDSVKTICVGDH I ESI 
NGEN I VGWRH YDVAKKLKE LKKEELFTM KL I E P KKAFE I E LRS K 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLEL YMG IRD I DLATTMFEAGKDKVNPD EFAVALDETLGDFAFP 
DEFVFDVWGVIGDAKRRGL 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETAR IGPG VMES KEERALNN 
L IVENVNQENDEKDEKEQVAN KGE P1*ALPLNVSEYCVPRGNRRR 
FR VRQP I LQ YRWD I MHRLGE PQARMR EENMERIGEE VRQLME KL 
REKQLSHSLRAVSTDPPHHDHHDEFC\LMP 


6143 


2802 


270 


FRMR I FLHCP WNQQMWKI WNLLETSLESCKAHLS I QKLLKER \Q 
\QLPVFKHRDSIVETLKRHRVVVVAGET\GSGKSTQVPHFLLED 
LLLNE WEASKCNI VCTQ PRRI SAVS LANRVCDELGCENGPGGRN 
SLCGYQIRMESRACESTRLLYC1"JGVLLRKLQEIX5LLSNVS/HM 
FI VDEV\HER\S VQSDFLL 1 1 LKEILQKRSDLHLI LMSATVDSE 
KFST YFTHCP I LR I SGRS YP VEVFHLEDI I EETGF VLEKDSE YC 
QKFLEEEEEVTI NVTSKAGG I KKYQEYI P VQTGAHADLNPFYQK 
YSSRTQHAI LYMNPH KINLDL I LELLAYLDKSPQFRNI EGAVL I 
FLPGLAH I QQL YDLLSNDRR FYS ERY KV I ALHS I LS TQDQAAAF 
TLPP PGVRKI VLATNIAETG I TI PDWFVIDTGRTKENKYHES S 
QMSS LVETFVS KAS ALQRQGRAGR VR DG FCFRM YTRERFEGFMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDFLSKALDPPQLQVISN 
AMNLLRKIGACELNE PKLTPLGQHLAALP VNVK IGXML I FGA1* F 
GCLDP VATLAAVMTEKS PFTTP I GRKDEADLAKSALAMADSDHL 
TI YNAYLGWiGCARQEGG YRS E I TYCRRN FLNRTSLLTLEDVKQE 
LIKLVKAAGFSSSTTSTSWEGKRASQTIiSFQEIALLKAVLVAGL 
YDNVGKI I YTKSVDVTEKLACIVETAQGKAQVHPSSVNRDLQTH 
GWLLYQE K I R YARVYltRETTL I TP FPVLLFGGD I E VQHRERI»I»S 
IDGWI Y FQ AP VKI AV I F KQLRVLI DSVLRKKLENP KMSLENDKI 
LQI ITELI K.TENN 


6144 


1289


568


SGPGSMSGQRVDVKWMLGKEYVGKTSLVBRYVHDRFLVGPYUN 

VSASGGARHGGRGSGGPVTui. ltek'Lfuc jrjjVA\i J-umac v«jw» 
VGDRTVTLGIWDTAGSERYEAMSRIYYRGAKAAIVCYDX.TDSSS 
FERAKFWVKELRSI»EEGCQIYI>CGTKSDLLEEDRRRRRVDFHDV 
QDYADNIKAQLFETSSKTloQbvi^fcjjrv^v^^^ 1 v£»vjwr ^v* 
DKGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRbSSPVPAVCRKEPCVLGVDEAGRGPVIj 
G PMVYAI C Y CPLPRIiADLiE ALKVADSKTLLE S ERERLFAKMEDT 
DFVGWALDVLS PNDI STSMLGRVKYNIiNSLSHDTATGIjIQYALD 
QGVNVTQVFVDTVGMPETYQARLQQSFPGIEVTVKAKADAIjYPV 

\vsaasicakvardqavkkwqfveklqdldtdyg\sgypndpqd 

/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 

L 


6146 


428 


781


LKKKGKEKAEAQQVEALPGPSIiDQWHRSAGEEEDGPVI>TDEQKS
R/YPGHEAHDQGG\WDARQSI IRKWDPETGRTRliI KGDGEVBE 
EI VTKERHRE INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQIjPPPSPGSGPGD'SPEGPEGEAPERRRKAHGMLKLYYGIjSE 
GEAAGRPAGPDPLDPTDLNGAHFDPBVYLDKLRRECPIiAQIjMDS 
ETDMVRQIRALDSDMQTLVYENYNKFISATDTIRKMKNDFRKME 
DSMDRI^TI^VITDFSARISATI^DRHERITKIAGVHALIjRKL 
1 QFLFELPSRLTKCVELGAYGQAVRYQGRAOAVI>QQYQHLPSFRA 
IQDDCQVITARIiAQQj^QRFREGGSGAPEG^ECVEIjIj1iAL.GEPA 
EELCEEFljAHARGRIjEKELRNbEAEl»GPSPPAPDVI>EFTDHG\S 
SGFVGGLCQVAAAYQELFAAQGPAGAEKIAAFARQLGSRYFALV 
ERRLAQEQGGGDNSI*LVRALDRFHPJU,RA^ 

EIVERVARERi^HHIXJGIJtAAFj^CLTDVRQAIAAPRVAGKEGP 
GLAEI*IANVASSILSHIKASLAAVHLFTAKEVSFSNKPYFRGEF 
CSQGVREGLI VGFVHSMCQTAQSFCDS PGEKGGATPPAI>I»IjIjIjS 
RI^LDYETATISYILTLTDBQFLVQDQFPVTPVSTIjCAEARETA 
RRI^THYVKVQGLVISQMLRKSVETRDWLSTIiEPRNVRAVMKRV 
VEDTTAIDVQVLPRIAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMCIWASHGASSVARASVREPQGNKSPRMNTKRAGECLCPRS 

csfsaqdydifapilpvekqrlrvtqevraglvlvlkirpqtns 
1 cilplphstgsinsdhvptk 


6148 


3056


353


VPAVGGTFADGAMGEAEKFHYIYSCDIjDINVQLKIGSLEGKREQ 

ksykavledpmlkfsglyqetcsdlyvtcqvfaegkplalpvrt 

SYKAFSTRWNWNEWLKLPVKYPDLPRNAQVAIiTIWDVYGPGKAV 

pvggttvs lfgkygmfrqgmhd lkvw pn crsqmdqkptkt pgrt 

SSTIjS EDQMSRLAKLTKAHRQGHMVKVDWLDRLTFRE I EM INES 

vkrssnfmylmggfrcvkcddkeygivyyekdgdesspiltsfe 

LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNIVSYPPSKPPTYEEQDLVWEFRYYIiTNQDKALTKILTSVIW 
DLP0^AKQA1^LGKWKPMDVEDSLEI»LSSHYTNPTVRRYAVAR 
LRQADDEDLLM YIXQLVQAL KYENFDD I KNGLEPTKKDSQSS VS 

eiwsnsginsaeidssqiit/sapfpsvssppp\asktkevpdg 

ENLEQDIiCTFLI SRAS KiNf STLAIJ YLYW YVI VECEDQDTQQRDPK 
THEMYLNVMRRFSQALIjKGDKSW 

kavqresgnrkkknerlqallgdne kmnls dveli pi*plepqvk 

IRGI I PETATLFKSALMPAQLFFKTEDGGKYPVIFKHGDDIjRQD 
QLII^IISLMDKLLRKENIjDLKIjTPYKVIiATSTKHGFMQFIQSV 
PVAEVLDTEGS IQNF FRKYAPSENG PNG I S AEVNDT YVKS CAG Y 
CVITYILGVGDRKLDNLLLTKTG KLFH I DFGYILGRDPKPLP PP 
MKLNKEMVEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNIilLNLF 
S LM VDANI PDIALEPDKTVKKVQDKFRLDLSDEEAVHYMQSLI D 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1


1413 


RVDPRVRENGTANP I KNGKTS PAS KDQRTG KKTS VQGQ VQ KGN D 
ESESDFESDPPSPKSSEEEEQDDEEVLQGEQGDFNDDDTEPENL 
GHRPLLMDSEDEEEEEKHSSDSDYEQAKAKYSDMSSVYRDRSGS 
G P TQDLNT I I>LTSAQ1»SS D VA VETPKQE FD VFGAVP FFAVRAQQ 
PQQEKNEKNLPQHRFPAAGLBQEEFDVFTKAPFSKKVNVQECHA 
VGPEAHTI PGYPKS VDVFGSTP FQPFLTSTSKSESNEDLFGLVP 
FDEITGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
S TKKTLKP T YRTPERARRHKKVGRRDSQS SNE FLT I S DS KEN I S 
VALTDGKDRGNVLQPEESbbDPFGAKPFHSPD\LSWHPP\HQGL 
S \D IRADHNT\ VLPGR\ PRQNSLHGSFHSADVLKMDD FGAVP / F 
LTELWQS I TPHQSQQS QPV\ELDPFGAAP FPS KQ 


6150 


372 


37 


MSN I KK Y 1 1 DYDWKAS I E I E IDHD VMTEEKLHQ I NNF WSD SE YR 
LNKHGSVLNAVLIMLAQHALLI AI S SDLNAYG WCEFDWNDGNG 
QEGWP PMDGS EG I RI TD I DTSG I F 


6151 


1555 


521 


DSNQQSVSGTAASTIiLHSFKATI Y YQGTGHVQQF YGVTS PYSQT 
TPP I VQSYAQPSLQYIQGQQI FTAHPQGVWQPAAAVTTI VAPG 
QPQ PLOPS EMWTNNLLDbPP PS PP KPKT I VLPPNW KTARDPEG 
KIYYYHVITRQTQWDPPTWESPGDDASLEHEAEMDLGTPTYDEN 

pmk\askkpktaeadtsselakkskevfrkemsqfivqclnpyr 

KPDCKVG\RITTTEDFKHIJ^KLTHGVMNKELKYCKNPE\DLEC 

nenvkhktkeyikkymqkfgavykpkedtefrvtvgpgwedgws 
g ktds r er ks cg p fcs tp v s tvllm i hhpg e fn pad vn 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
GPSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 
KCEHHCPCDPKTGNCS VS RVKQCLQP PEATLRAGELS FFTRTAW 
IALTLALAFLLLISTAANLS LLDSRAERNRRLHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD 


6153 


2 


3368 


GRVGARSPGRAYALLLLLI CFNVGSGbHLQVLSTRNENKLbP KH 
PHLVRQKRAWITAPVALLEGEDbSKKNPIAKIHSDLAEERGIiKI 
TYKYTG KG ITEP P FG I FVFN KDTGELNVTS I LDREETP F FLLTG 
YALDARGNNVEKPLELRIKVLDINDNEPVFTQDVFVGSVEEI>SA 
AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT 
GEIYTTSVTLDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQIR 
I LDVNDN I P WENKVLEGM VE ENQVNVE VTRIKVFDADE I G S DN 
WLANFT FASGNEGG YFH I ETDAQTNEG I VTL I KE VD YEEMKNLD 
FSVIVANKAAFHKSIRSKYKPTPIPIKVKVKNVKEGIHFKSSVI 
S I YVS ESMDRS SKGQ I IGNFQAFDEDTGLPAHAR YVKLEDRDNW 
ISVDSVTSEIKLAKIiPDFESRYVQNGTYTVKIVAISEDYPRKTl 
TGTVIjINVEDINDNCPTLIEPVQTICHDAEYVNVTAEDLDGHPN 
SGPFSFSVIDKPPGMAEKWKIARQESTSVLLQQSEKKLGRSEIQ 
FLISDNQGFSCPEKQVLTLTVCEVLHGS \ GCREAQHDSYVGIiGP 
AAIALM I LAFLLIJjLVPLLLIiMCH CGKGAKGFTP I PGTI EMLHP 
WNNEGAPPEDKWPS FljPVDQGGSLVGRNGVGGMAKEATMKGS S 
SASIVKGQHEMSEMDGRWEEHRSLLSGRATQFTGATGAI\MTTE 
TT I TARATGAS RDVAGAQAAAVAIiNEE FLKN Y FTDKAAS YTEED 
ENHTAKDCLLVYSQE ETESLNAS IGCCS F I EGELDDR FLDDLGL, 
KFKTLAE VCLGQKI D INKEI EQRQKPATETSMNTASHSLCEQTM 
VNSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSRQAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVILGPSQPQSLIV 
TERVYAPASTLVDQPYANEGTVWTERVIQPHGGGSNPLEGTQH 
LQDVP YVMVRER2S FLAP SSGVQPTLAMPNI AVGQNVTVTSRVL 
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPLPDFGLESSGH 
SNST I TTS STR VTKHS TVQHS YS 


6154 


3660 


2146 


KKKTKM KNTLQKTVN FGAW P KPT I SD KSHLLQM VSKLDLTDAKN 
SDTAHI KS 1EI TSILNGLQASESS AEDSEQEDERGAQDMDNNGK 
EESKI DHLTNNRNDL IS KEEQNSSSLIiEEKKVHADLVI S KPVSK 
S PERLRKD I E VLS EDTDYEEDEVT KKRKD VKKDTTDKSS KPQ I K 
RGKRRYCNTEECIiKTGSPGKKEEKAKNKESLCMENSSWSSSDED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSEVAEKRIKLI* 
NNSDERLQNSRAKDRKDVWS S IQGQWPKKTLKELFSDSDTEAAA 
SPPHPAPEEGVAEBSLQTVABEESCS PSVELE KPPPVNVDSKPI 
EEKTVEVNDRKAEFPSSGSNFSA* I PLPYLHLNRLHQSL * QKGS 
RQQSSVTVSEPLAPNQEEVRSIKSETDSTIETOSVAGELQDLQS 
ERE* LASRF* CQCELEQ * * S ARTRTS * KSLYRS EKS ERC SG RRK 
FIKKAEKKP* SNSGKQQKEGK 


6155 


869 


121 


HLLPEIiRGKS W ITMKYVFYLGVLAGTFFFADS S VQKEDPAP YLV 
YLKSH FN PCVGVLI KPS WVLAPAHCYLPNLKVMLGNFKSRVRDG 
TEO/TINPIQlVRYWNYSHSAPQDDLMLIKLAKPAMIiNPKVQALN 
P\ PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSHRNS LCVKFVKVFSRI FGEVAVATVI CKD KLQGI E 
VGHFMGGDVG I YTNVYKYVS W I ENTAKDK 


6156 


5725 


3984 


VMKKSETYAPLFCLPSFHKFCKGLIiADTLVEDyNICLQACSSLH 
ALS S S I>PDDLLQRCVDVCR VQLVHRGTCI RQAFGKLLKS I PLGV I 
FLSNNNHTEI QE I SLALRS HMS KAPSNTFHPQDFSD/ V I S FI LY
GNSHRTGKDI^IjERIiF YS CQRLDKRDQS TIPRNLLKTDAVLWQW
AIWFAAQF*TVI*SKLRTPI»GRAQDTFQTIEGI IRSLAGHTLNPDQ
DVSQVnTADNDEGHGNNQLRLVLLLQYLENLEKLMYNAYEGCAN
ALTS PPKVIRT FL YTNRQTCQDWLTRIRLS IMRVGLLAG QPAVT 
VRHGFDLLTEMKTTSLSQGNELEVS IMMWEALCELHCPEA1QG 
IAVWSSSIVGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCCISSFDKSVLTLASAGCKSASLKHCLNGESRKSVLSKPTDSS 
PEVI NYLGNKACECYI STADWAAVQEWQNAIHDLKKSTS STSLN 
LKADFNYIKSLSS FESGKFVECTEQLELLPGENINIiLAGGSKE K 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKIEQKYDADLENKLVDWIILQCAEDIEH 
PPPGRAHFQKWLMDGTVLCK^INSLYPPGQEPIPKISESKMAFK 
QMEQI SQ FLKAAET YG VRTTD I FQTVDLWEGKDMAAVQRTLMAL 
GS VAVTKDDGCYRGEPSWFHRKAQQNRRG FSEEQLRQGQNVI GL 
QMGSNKGASQAGMTG YGM PRQ IM* DAAS CP 


6158 


441 


1482 


LGSLIVLSLHCKVIFSSQSLERAMKEKAVDLVPILAQNPGLAQN 
P I L EG KDHNQNTG VD P 1 1 DHVQ DRKTD / S RSKS PHKKRS KSRER 
RKSRSRSHSRDKRKDTREKI KEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKEREKDREKDKEKDREREREKEHEKDRDKEKEKE 
QDKE KEREKDRS KE I DEKRKKDKKSRTP PRS YNASRRSRSSSRE 
RRRRRSRSSSRSPRTSKTIKRKSSRSPSPRSRNKKDKKREKERD 
HI SERRERERSTS MRKSSNDRDGKEKLE KNSTS LKEKEHN KEPD 
SS VSKEVDDKDAPRTEENKI QHNGNCQLNEENLSTKTEAV 


6159 


53 


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIVVPYTLPVSLPVGSCV 
IITGTPILTFVKDPQLEVNFYTGMDBDSDIAFQFRLHFGHPAIM 
NSCVFG I WR YEE KC Y YLPFEDGKPFELC I Y VRH KE Y KVMVNGQR 
I YNFAHRFP PAS VKMLQVFRD I SLTRVLI SD*GRC VRI TAVQEF 
DVSVS CDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP* F*KVADAQPTESE KE I YNQVNWLKDAEGILEDLQS 
YRGAGHEIREAIQHPADEKLQEKAWGAVVPLVGKLKKFYEFSQR 
LEAALRGLLGALTS T P YS PTQHL EREQALAKQFAE I LH FTLR FD 
ELKMTNPAIQNDFSYYRRTLSRMR1NNVPAEGBNEVNNELANRM 
SLFYAEATPMLKTLS DATTKFVSENKNLP I ENTTDCLSTMASVC 
RVMLETP E YRS RFTNEE TVS FCLRVMVGVI I L YDHVH P VG AFAK 
TSKIDMKGCIKVLKDQPPNSVEGLLtUUjRYTTKHLNDETTSKQI 
KS MLQ * QLLTLVNKG 


6161 


455 


1569 


PVSGSF^SIJlRAWASIbRLMIiGPRVAVSILCEDGISH*LI,EKH* 
KSHVLEPLSS LALEEQCLALSLDWSTGKTGRAGDQPLKI I SSDS 
TGQLHLLMVNETRPRLQKVASWQAHQFEAWIAAFNYWHPEIVYS 
GGDDGLLRGWDTR V PG KFLFTS KRHTMGVCS I QSSPHREHILAT 
GS YDSHI LLWDTRNMKQPLADTPVQGGVWRI KWHPFHHHLLLAA 
CMHSGFKI LNCQKAMEERQEATVLTSHT1»PDSLVYGADWS WLLF 
RSLQRAPSWSFPSNLGTKTADLKGASELPTPCHECREDNDGEGH 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSL 
LATCSFYDHALHLWEWEGN 


6162


1 


586 


RTIHATGRAGAS PMHRL I WRLAEANKQHVRCQKCLEFGH WTYE 
CTGKRKYIiHRPSRTAELKKALKEKENRLiiLQQSIGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKE I ELLHS YWTDGLKTLM 


6163


1081 


785 


RI RS TTEGCAVRLHPTQNTGKARIMI LLSVS LGRHWAFTYKFFL 

TPVVFVFFFFFFHRKE*VMQKNPMKSREDEWMEKIjNNLHVQRAD 

MNRLI MNYLVTEG FKEAAE KFRMESG I EPSVDLETLDER I K I RE

MILKGQIQEAIADINSLHPEIiliDTNRYLYFHLQQQHLIELIRQR 

ETEAALEFAQTQLAEC^EESRECLTEMERTLALLAFDSPEESPF 

GDLIiHTMQRQKVWSEVNQAVI^YENRESTPKLAKLLKLLLWAQN 

ELDQKKVKYPKMTDLS KGVI EEPK 


6164 


90 


406 


PCQS PGRS RMRQDKLTGSLRRGGRCLKRQGGG VGT I LSNVLKKR 

scisrtaprllcti^pgvdtkiikftlepslgqngfqqwydaiika 
varlstg i pkewrrkvwltladhyx.hs iaidwdktmrft fwers 
npdddsmgiqivkdlhrtgcs sycgqeaeqdrvvlkrvllayar 
wnktvgycxxsfniiiaalilevmegnegdalkimiylidkvlpes 
yfvnnlralsvdmav frdllrmklpelsqhldtlqrtankesgg 
g yep pltnvfttmqwfltl fat clpnqtvlki wds vf fegse i il 
rvslai waklgeq i eccetade fystmgrltqemlendllqshe 
lmqtvysmapfpfpqlaelrekytyni tpfpatvkpts vsgrhs 
kardsdeendpddedavvnavgclgpfsgflapelqkyqkqike 
pneeqslrsnniaelspgainscrseyhaafnsmmmermttdin 

ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIW 
HLLLG KKMKMTNRAAKNAVIH I PGHTGGKI S P VP YEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTN'GLGAA 
E A FPSGCTATAGREGSS PEGS TRRTI EGQ S PEPVFGDADVDVS A 
VQAKLGALELNQRDAAAETELRVHP P CQRHCPEP P S APEENKAT 
S KAPQGSNS KTP I FS P FPS VKPLRKSATARNLGLYG P TERTPT V 
HFPQMSRSFS KPGGGNSGP * KMVFSSGTMLSRQLPGYPQEYQRN 
GGERFG 


6165 


90 


406 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARliSTGI PKEWRRKVWLTLADHYljHSIAIDXiDKTMRFTFNERS 
NPDDDSMG I Q I VKDLHRTGCS S YCGQEAEQDRVVLKRVLLAYAR 
WNXTVG YCQGFNILAAL I LEVMEGNEGDALK I MI YL IDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHIJDTLQRTANKESGG 
GYEPPLTNVFTMQWFI»TIjFATCI»PNQTVLKIWDSVFFEGSEIIL 

rvslaiwaklgeqi eccetade fystmgrltqemlendllqshe 
lmqtvysmapfpfpqij^lrekytynitpfpatvkptsvsgrhs 
kardsdeendpddedawnavgclgpfsgflapelqkyqkqike 
PNEEQSIiRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRlKKKQQQQVHQVYIRADKGPVTSIIiPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIH IPGHTGGKI SPV P YEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA 
EAFPSGCTATAGREG SS P EGSTRRTI EGQS PEPVFGDADVDVS A 
VQAKLGAIiELNQRDAAAETELRVHP PCQRHCPE P PSAPEENKAT 
e xt~A Dnrtcisic v*?d t t?<5 pt?tj Q\/Tf Pt.P K' t ?&TRPMIiGl«YGPTE'RTPTV 
HFPQMSRSFSKPGGGNSGP*KMVFSSGTM3jSRQLPGYPQEYQRN 
GGERFG 


6166 


2 


1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEVKRVbYGKELRK 
LDI* PREAFEAASRED FELQG YAFE AAEEQLRRPR I VHVGLVQNR 
I PLP ANAP VAEQVSALHRRI KAI VE VAAMCGVN 1 1 CFQEAWTMP 
FAFCTREKLPVTI^FAESAEIXSPTnvFCQKIiAKNHDMVVVS PI LE 
RDSEHGDVtiWNTAWISNSGAVLGKTRKNHIPRVGDFNESTYYM 
EGNIiGHP VFQTQFGR I AVN I CYGRHHPLNWLiMYS INGAE 1 1 FNP 
SATIGAJuSESL»W PJ. CiAKNAAi ANMLr i \ — fiXlvr<.vvj i c>nr fvtar xo 
GDGKKAHQDFGYFYGSSYVAAPDSSRTPGLSRSRDGLLVAKLDIi 
NLCQQVNDVWNFKMTGRY EMYARELAEAVKSNYS PT I VKE * PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVBVSPEFVDEAIiCACEEYLSN 
LAH^IDKDLEAPIjYLTPEGWSLFIiQRYYQVVHEGAFXRHIiDTO 
VQRCEDI LQQLGAWPQ I DMEGDRNI W I VKPGAKSRGRG IMCMD 
HLEEMI*KLVNGKP VVMKDG KWWQKY I ERPLLI FGTKFDLRQWF 
LVTDWNPI J TVWFYRDSYIRFSTQPFSLKNLDK*API>YX.TPEGWS 
LFLQRYYQWHEGAELRHLDTQVQRCEDILQQLQAWPQIDMEG 
DRNIW IVKPGAKSRGKC* J.Mt.r^HLdaJiMJuJ^ v 
VQKYI ERPLLI FGTKFDIjRQWFLVTDWN PLTVWFYRDSY I RFST 
QPFSLKNLDK 


6168 


64 


1392 


VWPVPS VSAMPPKKQAQAGGSKKAEQKKKEKI I EDKTFGLKNKK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 

K I K i ■ h I\_Jr V v AAyK.1 i>x\.L?M_L/t^ivo V v V^e\Sr c Iv^Vjjy *— j. xv-Cl^zvk-ivx on. 

DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEVVNKKHG 
EAEKKKPKTQI VCKHFLEAI ENNKYGWFWVCPGGGD I CM YRHAI* 
P PGFVLKKKKKKKKKEDE I S L* DLI ERERSALG PNVTKITLES F 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
t?t A/wnnTYFU'zvrvnTR ytogtggdevddsvsvndidls LYIPRDVD 
ETG I TV AS LERFS TYTSDKDENKLS EAS GGRAENGERS DLEEDN 
EREGTENGAIDAVPVDENIiFTGEDIJDELEEELN'riiDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVITRIIKEALPDGVNISKEARSAISR 
AASVFVLYATSCANNFAMKG KRKTLNAS DVLSAMEEME FQRF VT 
PliKEALEAYRRBQKGKKEAS EQKKKDKDKKTDS EEQDKSRDEDN 
DE DE BRLEE EEQNE EEEVDN * KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRI^GCTVFlTGASRGIGKAIAIiKAAKDGANIVIA 
AKTAQPHPKLLGTI YTAAEE IEAVGGKALPCI VDVRDEQQISAA 
VEKAI KKFGGID I LVNNAS AI SLTNTLDT PT KRLDLMMNVNTRG 
TYLASKACI PYLKKSKVAHI PNISPPLNLNPVWFKQHCGRW* W 
G*GDGLCLICFBLNLCMSDVITICT 


6171 


382 


941 


HFMQS DVELDCD I E P CGHTKFPPTI*PLS TTVI VCS CHPVATAST 
MAEAFSKTTSEEDQS IQE PKEANSMTAQKQKK* GIjRGSRRRHAN 
SGGDI FGDS FAAYF PRVLKQVHQALSLSQEAVSVMD S MVRD I LD 
RIATEAGHLAHYSKCVTITSRDIRMAVCLIiLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMIjRREARIjRREYLYRKAREEAQR
SAQERKERLRRALEENRLIPTELRREALALQGSLEFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA 
QRMNRGRHEVGALVRACKANGVTDLLWHEHRGTPVGLIVSHI>P
FGPTAYFTLCNWMRHDI PDLGTMSEAKPHL ITHGFS SRLGKRV 
S D I LR YL F P V PKDDS HRV I T FANQDD Y I S FRHH V YKKTDHRNVE 
LTEVGPRFELKLYMIRLGTLEQEATADVEWRWHPYTNTARKRVF 
LSTE+AAPRPLGQLb


6173 


3 


288 


S VDHRE VQVLS OS M PI/T PH OA VLRGE RP Y MPV pre; k r Pr*» q cut ' 
IjQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
R I HTGERPYVCPLCG KAFNHSTVLRSHQRVHTGEKPHRCNECGK 
TFS VKRTLLQHQR I HTGEKP YTCS ECG KAFS DR S VL I QHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHL IQHQKVHRKL * PTCVLS VGSAIjAGVPTS FS I S VSTLERS P 
MCAVYVGR PSARAQSLVNTGQFTQVRS PMS VMS VEKPLE 


6174 


1060 


959


PRPPGKRWMVAGIiGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADIAJjAPLGDAQLVKLRPRRIiMNANGRSVARAAELFGL 
TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSCISCLNSNA 
MPRLRVGIGRPAHPEAVQAHVLGCFSPAEQELLPLLLDRATDLI 
LDHIRERSQGPSLGP *H* WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPIiRAMAAPVKGNRKQSTEG 
DALDPPASPKPAGKQNGIQNP I SLEDS PEAGGEREEEQEREEEQ 
«r xjvdlii i\.r i*i iVCKrl if J, £.K v PHL»G F KQ INliWKI Y KAVEKIX3A Y E 
LVTGRRLWKNVYNELGGS PGSTSGATCTRRH Y * RLVLPYVRHUC 
GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
PEAYKRLLS S F YCKGTHG I MS PLAKKKLLAQVS KVEALQCQEEG 
CRHGAEP0ASPAVHLiPESP>Oc?PK , f:T.TPMQT?HT?T tdoppt riarw 

SLREEAQAGPCPAAP I FKGCFYTH PTEVLKPVSQHPRDFFSRLK 
DGVLLG P PGKEGLS VKE PQLVWGGDANRPS AFH KGGSRKG I I*YP 
KPKACWVS PMAKVPAES PTLPPTFPSS PGLGSKRSLEEEGAAHS 
GKRIiRAVSPFLiKEADAKKCGAK'PAC? < 5GT J V l ?r , T .T .a PAT nm/D dt? a 
YRGTMLHCPLN FTGTPG PLKGQAALPFS PLVI P AFP AH FLATAG 
PS PMAAGLMHFP PTS FDS ALRHRLCP ASSAWHAP PVTTYAAPHF 
FHLNTKX, 


6176 


1040 


402 


PLSALRAMAE VH VI GQ I IGASGFS ESS LFCKWG I HTGAAW KIiI*S 
GVREGQTQVDTPQIGDMAYWS HP I DLHFATKGLQGW PRLH FQVW 
SQDS FGRCQIAG YG FCHVPSS PGTHQLACPTWRPLGSWREQLAR 
AFVGGG PQLLHGDTIYSGADRYRLHTAAGGTVHLEIGLL.LRNFD 
RYGVEC*GTIiPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQI KRSDFLGFSGYS PHFVAI STNSEHKMQPSSMQQAIi 
PSQ+PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGI KG VARAAS LVGRRRAGTGMALLLCLVCLTAALAHGCD 
HCHSNFSKKFS F YRHHVNFKS WWVGDI PVSGALLTDWSD0TMKE 
LHIiA I PAKI TREKLDQVATAV YQMMDQL YQGKM Y FPG YFPNELR 
NIFREQVHLIQNAIIESRIDCQHRCGIFQYETISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGI»LNYINNWHKQDTSMRPRSSAFSWPG 
THPAAPAFLVLPALRCLEPPHLANLSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNl^LSGAGRRIiWDWVPIiACRSFSljGVPRLIGIRLTl, 
PPPKVVDRWNEKRAMFGVYDNIGILGNFEKHPKELIRGPIWLRG 
WKGNELQRCIPJCRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR* KRKLRTSEKAHLS PWRRETVLFPVRKRLCI FS VI KWGFFGI 


6180 


156 


1833 


DHHIIiKAASTTHVCARGNlFAIPNTRCLEC*ATATPSSLECQN* 
S HLSLCP LPATTSGIiTPNS MI P EKERQNIAERLLRVM CADLGAL 
S WSG KE FI» KLAQTIjVBSGAR YGAFS VTEILGNFx^LAJjKHIjPR 
MYNQVKVKVTCAIiGSNACLG IGVTCHSQSVGPDSCYIlYrAYQAE 
GNHIKSYVLGVKGA^IRDSGDLVHHWVONVIjSEFVMSEIRTVYV
TD CRVSTSAFS KAGMCLRCSACALNS WQS VLS KRTLQARSMHE 
VIELLNVCEDLAGSTGLAKETFGSLEETSPPPCWNSVTDSLLLV 
HERYEQI CEFYSRAKKMNLI QSIjNKHLLSNLAAI LTPVKQAVI E 
LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 
KENFKVHPAHKVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV 
KESWAEBADFEPAAKKPRSAAVENPAAQEDDRDGKNEVYDYLQE 
P LFQATPDLFQ YWS CVTQKHTKIiAKtiAFW LIiAV P AVGARSGCVN 
MCEQALLI KRRRLLS PEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLQPRYRKNAYLFI 
YYLIQFCGHSW I FTNMTVRFFS FGKDSMVDTF YAIGLVMRLCQS 
VSLLELLH I YVGI ESNHLLPRFLQLTERI I IIiFWITSQEEVQE 
KYVVCVI*FVFWNIjLDMVRYTYS MLS VIGI S YAVLTWLSQTLWMP 
I YP LCVLAEAFAIYQS LP YFES FGTYSTKLPFDLS I YFPYVLKI 
YLMMLFIGMYFTYSHLYSERRDILGIFPIKKKKM*STAFQCDTR 
KDRLWIQCSK*NTGS ILVEKFLVF 


6182 


1769 


1224 


AS*IDYQLNTLLKEFQLTEENTKLRYLTCSLIEDMAAAYFPDCI 
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVPSERIATQKIIiSVLGECIjDIIFGPGCVGVQKILNARCPLVR 
FSHQASG FQCDLTTNNR I ALTSS ELL Y X YGALDS RVRALVFSVR 
CWARAHSLTSS I PGAW ITNFSLTMMVIFFLORRSPP ILPTLDSL 
KTLADAEDKCVIEGNNCTFVRDLSRIKPSQNTETLELLLKEFFE 
YFGNFAFDKNS INI RQGREQNKPDSS PLY 1 QNPFETSLN I S KNV 
SQSQLQKFVDIjARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 
APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 
STQT 


6183 


1118 


452 


HLDRY I KS PGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCX3GCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCLVKAFLM 
VP 


6184 


1 


2191 


IVTVREEDGAPAVAPPGVWSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNVVRI ITS 
ELYRSLGDVLRDVDAKALVRSDFLLVYGDVISNINrTRALEEHR 
LRRKL* KNVSVMTMI FKESS PSHPTRCHEDNVWAVDSTTNRVL 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDKLDCHISICSPQVA 
QLFTDN FDYQTRDDFVRGLLVNEE I LGNQIHMHVTAKE YGARVS
NLHWYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRG 
PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NWLDQTYLWQG VRVAAGAQ I HQSLLCDNAEVKERVTLKPR S VL 
TSQVWGPNITLPEGS VI S LH PPDAEEDEDDGEFS DDSGADQEK 
DKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELQQNLWGLKI 
NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 
GKEENISCDNLVLEINSLKYAYNISLKEVMQVLSHWLEFPLQQ 
MDSPLDSSRYCALLLPLLKAWS PVFRNYIKRAADHLEALAAIED 
F PLEH E ALG I S MAKVL.MAF YQLE I LAEET I LS WFS QRDTTD KG Q 
QLRKNQQLQRFIQWLKEAEEESSEDD 


6185 


791 


44 


PCTS CVLWATLHL PASTRKAPQAECGM I S I TEWQKI GVG IT G FG 
I FFILFGTLLYFDS VLLAFGNLLFLTGLSLI IGLRKTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 
REEWEPPPQS PALTHS PTYPGPPQVQKERNGAEQLTSNPQVDSR 
GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186 


569 


238 


VYGIDSSNTNTHGAEERNRKLKKHWKLCHAQSRLDVNGLALKMA 









KERKVKNKVKNKADTEEVFNNSPTNQEKMPTSAILPDFSGSVIS
NIRNQMETLHSQPHQEENLCFENSFSLINLLPINAVEPTSSQQI 
PNRETSEANKERRKMTSKSSESNiySPLTSFrTADSELHDIIKD 
liEDCLMVG LHTCG Dl» APNTLR I FTSNS E I KGVCS VGCCYHLLSE 
EFENQHKERTQEKWGFPMCHYLKEERWCCGRNARMSACLALERV 
AAGQGLPTESLFYRAVLQD 1 1 KDCYG I TKCDRHVGKI YS KCS S F 
LDYVRRSLKKLGLDESKLPEKI IMNYYEKYKPRMNELEAFNMLK 
WLAPCIETLILLDRI»CYUCEQEDIAWSALVKLFDPVKSPRCYA 
VI ALKKQQ+ FPLKQI IRCISL* DSAGCAE EVSVGDGGPAIiRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDS FIE PRPGRLPELEATRPHMEPKASCPA
AAPI«MERK FHVLVGVTGS VAALKLPLLVS KLLDI PGI/E VAWTT 
ERAKHFYS PQDI PVTLYSDADE WEMWKSRSDPVLH I DLRRWADL 
LLVAPU>ANTLGKVASGICDNIiIiTCVMRAVJDRSKPI,I>FCPAMNT 
AMWEHP ITAQQVDQLKAFGYVE IPCVAKKI,VCGDEGLGAMAEVG 
TlVDKVKEVLFQHSGFQQS*PGISVMGVPIiYSEWVQAKSVKMDV 
GK I GGYPHLIiNGG PALS L PRGQACSRLNWTEGPGIiS F FQPGEAA 
A 


6188 


238


1534 


KG FVMAGP LMAELQ VS PQWKAP EMS Ql CLS CGHPSA* GPRW AS W 
NIGVPICIRCAGIHP^I^VHISRVKSVNLIKJWTQEQIQCMQEMG 
NGKANRJLYEAYLPETFRRPQIDPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWFEKVKMPQKKEDPQLP 
RKSSPKS TAP VMDLLGLDAP VACS IANSKTSNTLEKDLDLLASV 
PS PSSSGSRKWGSMPTAGSAGSVPSNLNLFPEPGSKS EE I GKK 
QLSKDSILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
PNS I MGSMMPPPVGMVAQPGASGMVAPMAMPAGYMGGMOASMMG 
VPNGMMTTQQAGY^GMAAMPQTVYGVQPAQQI^WNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLSPQMWK 


6189 


1297 


793 


LGE PLiGDLCEliI PGDVQQLQMGE VHPGTGAQGSAAQSVAGEVQL 

TQLSHARQR PS CQGS QI>IALDI/QHMDI SRQPRWQHVQP VARQVQ 

RACK3AQI^GVAVHLWAGDAVVAEVELLQEVGGGKVFAANACDL 

WQDHEGAHAARQATGHALQRVIVQVRRVQPLEAL*RVPSGLPR 

RVRAFMI LHNQ I TG IGRED FATTY FLEELNLS YNRITS PQVHRD 

AFRKIjRLLRSLDI^GNRLHMLPPGLPRNVHVLKVKRNELAA^ 

GALAGMAQLRELYIjTSNRIjRSRAI/SPRAWVDIJVHLQLLDIAGNQ 

LTE I PEGLPES LE YLYLQNN KI S AVPANAFDS TPNXiKG I FLRFN 

KLAVGSVVDSAFRRLKHLQVIiDIEGNLEFGDISKDRGRLGKEKE 

EEEEDEVEEEETR 


6190 


66 


1309 


ilvgnvsflls faeyvcncs wgslwvnrcnqttgqcecrpg yq 
glhcetckegf y^n ytsglcqpcdcs phgals i pcnssgkcqck 
vgvigsicdrcqdgyygfskngclpcqcnnrsascdaltgacln 
cqens kgnhceeckegfyqspdatkeclrcpcsavtstgscs i k 
ssele pecdqckdgyigpncnkcengyynfds 1 crkcq chghvy 

P VKT P KICKPESG E CINCLHNTTG FWCENCL * G YVHDLEGNCI K 

kvilptpegstilvsnaslttsvptpvinstftpttlqtifsvs 

TSENS TSALADVS WTQFN 1 1 ILTVI I IVWIit,MGFVGAVYMYRE 
YONRKLNAPFWTI ELKEDNISFSS YHDS I PNADVSGLLEDDGNE 
VAPNGQLiTLTTP IHNYKA 


6191


1212 


1511 


VNL CHGGLXiHLS THHLG I KPSMH * L FFLMLS FPHLT PQQ P KCPS 
MIDW I KKI WY I YTME YYATI KRNE IMFFAGTWMEMEAI ILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGIiVS DAYGEDDFS RIiGGDEDG YEEE EDENSRQSEDDDS ETEK 
PEADDPKDNTEAE KRDPQELVAS FSERVRNMS PDE I KI P PEP PG 
RCSNHLQDKIQKLYERXI KEGMDMNYI IQRKKEFRNPS 1 YEKLI 
QFCAI DELGTNYPKDMFDPHGWSEDS YYEALAKAQK I EMDKLEK 
AK KE RTKI E F VTGTKKGTTTN ATSTTTTTAS TAVADAQKRKS K W 
DSAI P VTTI AQ PT I LTTTATLP A WTVTTS ASG S KTTVI SAVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGN KMAG KKNVLS S LAVYAEDS SPES DG E AG IEAVGS AAE E 
KGGLVSDAYGEDDFSRLGGDEDGYE3EEDENSRQSEDDDS3TEK 
PEADDPKDNTEAEKRDPQELVASFSERVRKMSPDEIKIPPEPPG 
RCSNHLQDKI QKLYERKI KEGMDMNYI IQRKKEFRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKIiEK 
AKKERTKI E F VTGTKKGTTTNATS TTTTTASTAVADAQKRKS KW 
DS A I ? VTT I AQPTILTTTATLPAWTVTT SASGS KTTVI SAVGT 
IVKKAKQ 


6194 


3 


950 


TRGCGNKMAG KKNVLS SLAVYAEDSEPESDGE AG I EAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEABKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCS NHLQDKI QKLYERKI KEGMDMN Y I IQRKKE FRNPS I YE KI»I 
QFCAI DELGTNYPKDMFDPHGWSEDS Y YEALAKAQKI EMDKLEK 
AKKERTKI EFVTGTKKGTTTNATS TTTTTASTAVADAQKRKS KW 
DSAI PVTTIAQPTILTTTATLPAVVTVTTS ASGS KTTVI SAVGT 
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDSPSVRKTHCSGRKHKENVKD 
YYQKWMEEQAQSLIDKTTAAFQQGKI PPTPFS APPPAGAMI PPP 
PSL PG P PRPGMMPAPHMGG P PMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGPPMMRPPARPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRS AAY VRN I LDNAEQVI SNLEARNL3PRLTP LLQEEDSH 
QRLDMGLMVS ELKDHFLRHLQGVEKKKIEQMVLDYI SKLLDLI C 
HIVETNWRKHNLHSWVLHFNSRGSAAEFAVFHIMTRILEATNSL 
FLP LPPGFHTLHTI LGVQCLPLHNLLHC I DSG VLLLTETAVI RI* 
MKDLDNTEKNEKLKFS 1 1 VRLPPLIGQKICRLWDHPMSSNI ISR 
NHVTRLIjQNYKKQPRNSMINKSSFSVEFLPLNYFIEILTDIESS 
NQALYP FEGHDNVDAEFVEEAALKHTAMLLGL 


6197 


3 


819 


ADPEGTESAVMSRYTRPPNTSLFIRNVADATRPEDLRREFGRYG 
PIVDVYIPLDFYTRRPRGFAYVQFEDVRDAEDALYNLNRKWVCG 
RQIE I QFAQGDRKTPGQMKSKERHPCS PSDHRRSRS PSQRRTRS 
RS S S WGRNRRRSDSLKESRHRRFS YS QS KSRS KSLPRRSTS ARQ 
SRTPRRN7GSRGRSRSKSLQKRSKS IGKSQSSS PQKQTSSGTKS 
RSHGRHSDS IARS PCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


S EAALS PS FI SPACFLLRKLPALEDGTLPHPDTLGMN YEGARSE 
RENHAADDSFA5GALDMCCSERIiPGLPQPIVMEALDEAEGLiQDSQ 
REMPPPPP PSPPSDPAQKPPPRGAGSHSLTVRS SLCLFAASQFI, 
LACGVLWFSG YGH I WSQNATNLVSS LLTLLKQLEPTAWLDSGTW 
GVPSLLLVFLSG^L^VTTLVWHLLRTPPEPPTPLPPEDRRQSV 
SRQ PS FT YS EWME EKI EDDFLDLDP VPETP VFDCVMD I KP BAD P 
TSIiTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYLMSPREES 
AREYLLSASRVLQAEELHEKALDPFLIjQAEFFEIPMNFVDPKEY 
D I PGLVRKNR YKT I LPNPHSRVCLTS PDPDDPLSSY INANYIRG 
YGGEEKVYI ATQGPI VSTVADFWRMVWQEHTP 1 1 VKITNI EEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAG IGRTGCF IATS I CCQQLRQEG WD ILKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQS PE 


6199 


144 


1211 


MARENGESSSS WKKQAEDI KKI FEFKETLGTGAFSEWLAEEKA 
TGKLFAVKCI ? KKALKGKESS I ENE I AVLRKI KHENIVALEDI Y 
BS PNHLYLVMQLVSGGELFDR I VEKGFYTEKDASTLJCRQVLDAV 
Y YLHRMG I VHRDLKP ENLLYYS QDEES KIMI SDFGLSKMEGKGD 
VMS TACGTPG YVAPE VLAQKPYS KAVDCWS I GVI AYILLCG YP P 
FYDENDSKLFEQILKAEYEFDS PYWDDISDSAKDFIRNLMEKDP

NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 

AFNATAVVRHMRKLHIXSSSLDSSNASVSSSLSLASQKDCASG^F 
HAL* 


6200 


702 


96 


LPEVPHSLRPRVKPHLCCAQPAVRVMARLPKLAVFDLDYTLWPF 
WVDTHVDP P FHKS SDGTVRDRRGQDVRLYP EVPEVLKRLQS LG V 
PGAAAS RTS E I EGANQLLELFDLFR YFVHRE I YPGS K I THFERL 
QQKTGI PFSQMI FFDDERRNI VDVSKLGVTCIHIQNGMNLQTLS 
QGLETFAKAQTGPLRSSLEES PFE A 


6201 


2809 


2383 


GQT PR VR W KMRRS LRAGKRRQTAGRKS KS P P KVP I V I QDDS L PA 
GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSLAQREAEYAEA 

RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPOKDRKPKRSTWRFNLDLTHPVE 
DG I FDSGNFEQFLREKVKVKGKTGNLGNWHI F.RFKNKITWSE 
KQ FS KRYLKYLTKK YLKKNNLRD WLRWASDKET YELR YFQ ISO 
DEDESESED 


6203 


419 


2550 


RCPRPPATAGAAASRPDRSPPSGISGSEAAAGAGAAAPASQHPA 
TGTGAVQTEAMKQ I LG VI DKKLRNLEKKKGKLDDYQERMNKGER 
LNQDQLDAVS KYQEVTNNLEFAKELQRSFMALSQDIQKTI KKTA 
RREQLMREEAEQKRLKTVLEI^YVI^ia>GDDEVRTDLKQGLNGV 
P ILSEEELSLLDEFYKLVDPERDMSLRLNBQYEHAS I HLWDLLE 
GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEESEAA 
SAPAVEDQVPEAE PE PAE E YTEQS E VESTE YVNRQFMAETQ FTS 
GEKEQVBEWTVETVE VVNS LQQQPQAAS PS VPE PHS LT P VAQAD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTLDPAIVSAQPM 
NPTQNMDMPQLVCPPVHSESRLAQPNQVPVQPEATQVPLVSSTS 
EG YTASQPLYQPSHATEQRPQKEP I DQIQATISLNTDQTTAS SS 
LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFKMNAPVP 
PVNEPETLKQQNQYQASYNQSFSSQPHQVEQTELQQEQLQTWG 
TYHGS PDQSHQVTGNHQQPPQQNTG FPRSNQP YYBTSRGVSRGGS 
RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 
RDYS GYQRDG YQQNFKRGSGQSGPRGA PRGRGGPPRPNRGM PQM 
NTQQVN 


6204 


2933 


787 


CTHNLI SLLGGRALIHFNRFLNLKI QEGEAHNI FCPAYDCFQLV 
PGDI IKSWSKEMDKRYLQFDIKAFVENNPAI KWCPTPGCDRAV 
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWECLGEAHEP 
CDCQTWKNWLQKI TEMKPEELVG VS EAY EDAAN CLWLLTNSKPC 
ANCKSPIQKKEGCNHMQCAKCKYDFCWICLEEWKKHSFVHWEVI 
YRCTRYBVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHS YQLEQRLLKTAKBKMEQLSRALKETEGG CPDTTF I EDA V 
HVLLKTRRI LKCS YP YGPFLE PKSTKKE I FELMQTDLEMVTEDL 
AQKVNRP YLRTPRHKI IKAACLVQQKRQEFLAS VARGVAPADS? 
EAPRRS FAGGTWDWKYr/5P2x ^ PT7irvE.ir K*rvv^PT>«Dr»t>r> or'T^TTur- 

LLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
S LRDYT PAS RS ENQDSLQALS S LDEDD PN I LLAI QLSLQES GLA 
LDEETRDFUSNEASL43AIGTSLPSRLDSVPRNTDSPRAALSSSE 
LLELG DS LMRLGAENDPFSTDTLS SKPLS EARSD FCPS SSDPDS 
AGQDPN INDNLLGNI MAWFHDMNPQS IALI PPATTEI SADSQLP 
CI ECDGSEGVKDVELVLPEDSMFEDAS VSEGRGTQ1 EENPLEENI 
PGGGKQHPQAW 


6205 


1 


1200 


RAHRGKMALEVGDMEDGQLSDSDSDKTVAPSDRPLQLPKVLGGD
S AMRAFQNTATACAP VSHYRAVES VDSS EES FSDSDDDS CLWKR 
KRQKC FNP P PKPEPFQFGQS SQKPPVAGGKKINNI WGAVLQEQN 
QDAVATELG I LGMEGT I DRS RQSETYNYLLAKKLRKESQEHTKD 
LDKELDEYMHGGKKMGSKEEBNGQGHLKRKRPVKDRLGNRPEMN 
YKGRYE ITAEDSQEKVADE IS FRLQEPKKDLI AR WRI IGNKKA 
I3LLMETAEVEQNGGLFIMNGSRRRTPGGVFLNLLKNTPSISEE 
QI KDI FYIENQKEYENKKAARKRRTQVLGKKMKQAIKSLNFQED 
DDTSRETFASDTNEALASLDE SQEGHAEAKLEAEEAI EVDHSHD 
I»DIF 


6206 


10 


1442 


1 1 SERRERSCIiHLVCIRCSCDWEMGSVLGLCSMASW IPCLCGS 
APCIiCRCCPSGNNSTVTRLIYALFLLVGVCVACVMLIPGMEEQ 
LNKIPGFCENEKGWPCNILVGYKAVYRLCFGLAMPYLUbSLLM 
I KVKSSSDPRAAVHNGFWFFKFAAAIAI I IGAFFIPEGTFTTVW 
FWGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
ALLSATALNYI^LVAIVLFFVYYTHPASCSENKAFISVNMLLC 
VGASVMS ILPK JQESQPRSGLLQSSVI TVYTMYLTWSAMTNEPE 
TNCNPSLLS I IGYNTTSTVPKEGQSVQWWHAQG I IGLILFLLCV 
FYSSIRTSNNSQVNKLTLTSDESTblEDGGARSDGSLEDGDDVH 
RAVDNERDGVTYSYSFFHFMIjFIiASIjYIMMTIjTNWYRYEPSREM 
KSQWTAVWVKISSS W I GI V LYWTLVAP LVLTNRDFD 


6207 


2924 


1471 


CVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY
SWCKYFQRGYCIYGDRCRYEHSXPLKQEEATATELTTKSSLAA

SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQH I KSC I EA 
HEKDMELS FAVQRSKDMVCG ICMEWYEKANPSERRFGIIiSNCK 
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE
VVTF3LGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6208 


2924 


1471 


T VMAEAATPGTTATTSGAGAAAATAAAAS PTPI PTVTAPSLGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
S SS LSS IVG P LVEMNTGE AESRNSNFATVGAGS EDW VNAI EFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA 
HEKDMELS FAVQRSKJDMVCGICMEWYEKANPSERRFG ILSNCN 
HT YCLKCI RKWRSAKQFES KI I KSCPECR I TSN FVT PS E YWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCP FGGNCFY KHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDS E DEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEENSVTHHEVKCOGK 
PLAG I YRKREEKRNAGNAVRSAMKSEEQKIKDARKGPLVP FPNQ 
KS EAAE PPICTP P SSCDSTNAAIAKQALKKP I KGKQAPRKKAQGK 
TQQNRKLTDF YP VRRSSRKS KAELQS EERKR IDELI ESGKE EGM 
KI DL IDGKGRG VI ATKQFSRGDFWE YHGDLI E ITDAKKRE ALY 
AQDP STGCYMYY FQ YLS KT YCVDATRETNRLGRLINHS KCGNCQ 
TKLHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPWL 
KH 


6210 


3761 


387 


IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITVVLLLSACFVT 
SSLGGTDKELRLVDGEIN KCS GRVE VKVQEEWGTVCNNGWSMEAV 
S V I CNQLGCPTAI KAPGWANS SAGSGRI WMDHVS CRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNI DHAS VI CRQLE CGSAVS FSGSSNFGEG 
SGP I WFDDLI CNGNESALWNC KHQGWGKHNCDHAEDAGVI CS KG 
ADLSLRLVDGVTECSGRLBVR FQGEWGTI CDDG WDS YDAAVACK 
QLGCPTAVTAIGRWASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
MGKHYCWHN^DAGVTCSDGSDLEIjRLRGGGSRCAGTVE VE I QRL 
LGKVCDRGWGLKEADWCRQLGCGSALKTS YQVYS KI QATNT WL 
FJjSSCNGNBTSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV 
GGDI PCSGRVEVKHGDTWGS I CDSDFSLEAASVLCRELQCGT W 
S I LGGAH FGEGNG Q I WAEEFQ CEGHESH LS LCPVAPRPEGTCSH 

I EDAHVL CQQLKCGVALSTPGGAR FG KGNGQI WRHM FHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DSWDLSDAHWCROLGCGEAINATGSAHFGEGTGPIWLDENKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVF YNGAWGTVG KS SMS ETTVGWCRQIjG CADKGK I N P 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
W I TCDNKI RLQEG PTS CSGRVEI WHGGSWGTVCDDSWDIiDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSIiWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VG I LGWLLAI FVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNS CLNADDLDLMNS SGGHSEPH 


6211 


3761 


387 


I FGMSKLRM VI.LEDSGSADFRRHFVNLSPFTI T WLLLSACFVT 
SSLGGTDKELRLVDGEWKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S VI CNQLGCPTAI KAPGWANSSAGSGRI WMDIIVSCRGNESALWD 
CKHIX5WGKHSNCTHO^DAGVTCSDGS3^1IjEMRIjTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGPI WFDDLI CNGNESALWNCKHQGWGKHNCDHAEDAGVI CSKG 
ADLS LRLVDGVTECSGRLEVRFQG EWGT I CDDG WDS YDAAVACK 
QLGCPTAVTAI GRVNAS KG FGH I WLDS VS CQGHE PAVWQ CKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LG KVCDRG WG LKEADWCRQ LGCG SAL KT S YQ VY SKI Q ATNT WL 
FLS S CNGNETS LWDCKN WQWGGLTCDH YEEAKI TCSAHREPRLV 
GGD I PCSGRVEVKHGDTWGS I CDS DFSLEAAS VLCRELQCGT W 
S I LGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRFEGTCSH 
aRU v v v^£>k i x c, x. K-bVrKjKl FCKGRVELKTLGAWGSLCNSHWD 
IEDAHVIiCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ 
HMG DC P VTALG AS L CPS EQ VAS V I C S GNQSQTLS S CNS S S LGPT 
RPTI PEESAVACI ESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DS WDLSDAHWCRQLGCGEAI NATGS AHFGEGTGP I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSUUiTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCKQIX5CADKGXINP 
ASLDKAMSIPMWDNVQCPKGPDTLWQCPSSPWEKRLASPSBET 
WI TCDNKI RLQEG PTS CSGRVE I WHGGS WGTVCDDS WDLDDAQV 
VCOGI»GCGPALKAFKRA.EFGOGTGPTWI^KVKr , Kr:MPQ<;T.wnr , r> 
ARRWGHSECGHKEDAAVN CTDI SVQKTPQKATTGRS SRQSS FI A 
VGILGWLLAJ FVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNS CLNADDLD LMNSSGGHS EPH 


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPI^GSMAINSI. 
S KLTQLTQS SMYS I* PMAPTLADLEDDTHEAS DDQ PE KPHFDS RS 
VI FELDSCNGSGKVCLVYKSGKPALAEDTEI WFLDRALYWHFLT 
DTFTA Y YRLLI THLGLPQWQYAFTS YG I S PQAKQRVSMYKP I T Y 
NTNLLTEETDSFWKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTN PGP PS S LRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESS PGVTBVTI I EKPPAERHMISSWE 
QKNNCVMPED\nWFYI^TNGFHMTWSVKLDEHIIPLGSMAINSI 
S KLTQLTQS S MYSL PNAPTLADLEDDTHEASDDQPEKPH FDSRS 
V I FELDSCNGSGKVCLVYKSGKPALAEDTE I WFLDRALYWHFLT 
DTFTAYYRIOjITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HE1APSAIRRAARLGLGPARWQSRAAAFYFVRGFRTGWS FVGWV 
VLGTS AKRTRLF FFLS KMAAS SRAQVLALYRAMLRESKRFS AYN 
YRTYAVRR I RDAFRENKNVKDPVE I QTLVNKAKRDLGVIRRQVH 
IGQLYSTDKLI IENRDMPRT 


6215 


2 




RV21flflPT?f5<ir:<^A3^FTMPFTRVTPliGAGODVGRSCI L»VS IAGKNV 
MLDCGMHMG FNDDRRF PDFS Y ITQNG RTVTDFLD CVI I SHFHLDH 
CGALPY FSEMVGYDGP I YMTHPTQAI CPILLEDYRKIAVDKKGE 
ANFFTSQMI KDCMKKVVAVHLHQTVQVDDFXiE I KAYYAGHVLGA 
AMFQI KVG S ES VVYTGDYNMTPDRHLGAAW I DKCRPNLL I TEST 
YATTI RDSKRCRERDFI*KKVHETVERGGKVIjI PVFALGRAQELC 
ILLETFWERMNLKVPI Y FSTGLTEKANHYYKLFI PWTNQKIRKT 
FVQRNMFEFKHI KAFDRAFADNPGPMVVFATPGMLHAGQSLQIF 
RKWAGNEKNM VI M PGYCVQGTVGHKZ LSGQRKLEMEGRQVLEVK 
MQVE YMSFSAHADAKGI MQLVGQAE P ES VLLVHGEAKKME FLKQ 
KIEQELRVNCYMPANGETVTLPTSPS I PVGISLGLLKREMAQGL 
LPEAKKPRLIjHGTLIMKDSNFRLVSSEQAUCELGIiAEHQLRFTC 
RVHLHDTRKEQETAIjRVYS HLKSVLKDHCVQHLPDGS VTVE S VL 
LQAAAPSEDPGTKVLLVSWTYQDEELGSFIjTSIiliKKGIiPQAPS 


6216 


11 


393 


QTTRPEPRNSALRQSRSKMAVVGVSSVSRIjIjGRSRPQLGRPMSS 
riJxMrtPPR^AKMMKTI.TFFVATjPGVAVSMIilSIVYIjKSHHGEHERPE 
FIAYPHLRIRTKPFPWGDGNHTLFHWPHVNPIjPTGYEDE 


6217 


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 
LRKLFIGGLS FETTDDSLREHFEKWGTLTDCWMRDPQTKRSRG 
FGF VT YS CVE BVDAAMCARPHKVDGR WEP KRAVSREDS VKPGA 
HLTVKKI FVGG 1 KEDTEE YNIiRDY FEKYGKI ETI EVMEDRQSGK 
KRGFAFVTFDDHDTVDKIWQKYHTINGHNCEVKKALSKQEMQS 
AGSQRGRGGGS GH FMGRGGNFGGGGGNFGRGGN FGGRGG YGGGG 
GGSRGS YGGGDGG YNGFGGDGGN YGGGPGYS SRGG YGGGGPGYG 
NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYWDFGNYSGQQQS 
NYGPMKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218


1305


906 


S CERRGF I MADDLKRFIiY KKL PS VEGLHAI WSDRDG VPVI KVA 
NDNAPEHAIiRPGFt»S TFAIiATDOGSKLGLS KNKS 1 1 CYYNTYQV 
VQFNRltPLWS F IAS S SANTGL I VSLE KELAPLFEEI»RQVVEVS 


6219 


2 


890


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPEADRPHQRPFI* 
IGVSGGTAS GKS TVCEKIMELLGQNE VEQRQRKWI I>SQDR F YK 
VLTAEQKAKALKGQYNTOHPDAFDNDLMHRTLKNIVEGKTVEVP 
TYDFVTHSRLPETTWYPADWLFEG I LVF YSQEI RDM FHLRI*F 
VDTDSDVRLSRRVLRDVRRGRDLEQIIjTQYTTFVKPAFEEPCLP 
TKKYADVI I PRGVDNMVAINLIVQHIQDILNGDICKWHRGGSNG 
RSYKRTFSEPGDHPGMLTSGKRSHIjESSSRPH 


6220 


227 


764 


EQNISLEWSCTIEKAIiADAKALVERLRDHDDAAESIil EQTTALN 
KRVEAMKQYQEE IQELNEVARHRPRSTIiVMG IQQENRQIRELQQ 
ENKELRTS LEEHQSALELIMSKYREQMFRLLMASKKDDPGI IMK 
LKEQHS KI DM VHRNKS EGFFLDASRH I LEAPQHGUSRRHLEAN'Q 
NVH 


6221 


98 


916 


RWI WDLNPVSDGLELRPKYKGI LHCLTT I WKLDGLRGLYQGVTP 
NI WGAGLS WGLY FVFYNAI KS YKTEGRAERIjEATE YLVS AAEAG 
AMTLC I TN PI>WVTKTR LMLQYDAWNS PHRQ YKGMFDTLVKI YK 
YEGVRGLYKGFVPGLFGTSHGALQFMAYELLKIiKYKQHINRLPE 
AQLS TVE Y I SVAALS K I FAVAAT YP YQVVRARLQDQHMF YSGVI 
DVI TKTWRKEGVGG FYKG IAPNLI RVTPACC ITFWYENVSHFI* 
LDLREKRK 


6222 


2 


2116 


MARELRAJjLLWGRRIiRPIiLRAPAIiAAVPGGKP I I»CPRRTTAQLG 
PRRNPAWSIiQAGRLFSTQTAEDKEEPLHSIISSTESVQGSTSKH 
EFQAETKKLLDI VARS LYSEKEVFIRELISNASDALEKLRHKLV 
SDGQALPEMEIHLQTNAEKGTITIQDTGIGMTQEELVSNXGTIA 
RSGSKAFLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVYSR 
SAAPGSLGYQWLSDGSGVFEIAEASGVRTGTKII1HLKSDCKEF 
S S E ARVRD WTK YSNF VS F PL YLNGRRMNTLQAI WM MDP KD VRE 
WQHBE FYR YVAQAHD KPR YTLH YKTDAPLNI RS I FYVPDMKPSM 
FDVSRELGSSVAIiYSRKVLIQTKATDILPKI^LRFIRGWDSEDI 
tr ah? uoKr.ii jj^tiij/uj x. k rs_UKU VtiQQRJu I Kr F IDQSKKDAEKYAKF 
FED YGLFMR3GI VTATEQEVKED I AKLLRYES SALPSGQLTS LS 
EYAS RMRAGTRNI YYLCAPNRHLAEHS PYYEAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CI>SEKETEEI^WMRim^SRVTNVKVTLRLDTHPAM^ , VI»EMG 
AARHFLRMQQLAKTQEERAQLLQPTLE I NPRHALI KKLNQLRAS 
EPGI^QI>LVDQIYENAMIAAGLVDDPRA>rVGRLNEIiLVKALERH 


6223 
6224


3 
1 


715 
133 


DAWARTMAGMVDFQDEE Q VKS FLENME VE CN YHC YHEKD PDGCY 
RLVDYLEGIRKNFDEAAKVLKFKCEENQHSDSCYKLGAYYVTGK 
GGLTQDLKAAARCFLMACEKPGKKSIAACHNVGLLAHDGQVNBD 
GQPDLGKARDYYTRACDGGYTSSCFNIiSA>lFLQGAPGFPKDMDI» 
ACKYSMKACDLGHIWACANASPJ^YKLGDGVDKVRAKAEVIjKNRA 
QQVHKEQQKGVQPLTFG 


6225 


3259 


938 


LRTI SSHAWGPLLLTIjLAHCTGSWAQSVLTQP PSVSGARI phek 

llschriaicki^fsvesrktvmgpqgarrqafiafgdvtvdft 

OKERTRXiLSPAQRALYREVTLENYSHLVSLGIIjHSKPEIiIRRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGLAHQRQQQLQFSD 
QS FQSDTAEGQEKEKSTKPMAFSS PPLRHAVSSRRRNS WEIES 
SG^QRENPTElDKVIiKGIENSRWGAFKCAERGQDFSRKMMVI ih 
KKAHSRQKLFTCRECHQGFRDESALLIiHQNTHTGEKSYVCSVCG 
RGFSLKANLLRHQRTHSGEKPFLCKVCGRGYTSKSYLTVHERTH
TGEKPYECQECGRRFNDKSSYNKHLKAHSGEKPFVCKECGRGYT 
NKSYFWHKRIHSGEKPYRCQECGRGFSNKSHLITHQRTHSGEK 
PFACRQCKQS FS VKGSLLRHQRTHSGEKPFVCKDCERS FSQKST 
LVYHQRTHSGEKPFVCRECGQGFIQKSTLVKHQ ITHSEE K?FVC 
KBCGRGFIQKSTFTLHQRTHSEEKPYGCRECGRRFRDKSSYNKH 
LRAHLGEKRFFCR.DCGRGFTLKPNLTIHQRTHSGEKPFMCKQCE 
KSFSLKANLLRHQWTHSGERPFNCKDCGRGFILKSTLLFHQKTH 
SGEKPFICSECGQGFIWKSNLVKHQLAHSGKQPFVCKECGRGFN 
WKGNLLTHQRTHSGEKPFVCNVCGQG FSWKRS LTRHHWR I HS KE 
KPFVCQECKRGYTSKSDIiTVHERIHTGERPYECQECGRKFSNKS 
YYSKHLKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKVSELLGGSQRLFFI>PI,WRRLCRCGU5PRVSP^GPRVEVIX3S 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227


2581 


890 


i'ioy-io oiji-it^Mf rxs ^vyi^tii) VttyKULibN L)UU FE? YLtS PQ AR P 
NNAYTAMSDS YLPS YYS PS I G FS YS LG E AAWS TGGDTAMP YLTS 
YGQLSNGEPKFLPDAMFGQPGALGSTPFLGQHGFN^FPSGIDFS 
AWGNNS SQGQS TQS SG YSS N YA YAPSS LGGAM 1 DGQSAEANETI* 
NKAPGMNTIDOGMAAIjKIjGSTEVASNVPKVVGS AVGSGS I TSNI 
VASNSLP PAT I AP P KPAS WAD I ASKPAKQQ PKLKTKNGIAGSSIj 

ppppikhnmdigtwdnkgpvakapsqajlvqnigqptqgspqpvg 
QQANNSPPVAQASVGQQTQPLPPPPPQPAQLSVQQQAAQPTRWV
APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKIIRSIN
NYNPKDFDWNLKHGRVFIIKSYSEDDIHRSIKYNIWCSTEHGWK
rldaayrsmngkgpvyiilfsvngsghfcgvabmksavdyntcag 

VWSQD K WKGRFDVRW IFVKD VPNSQLRH I RIiENNETJKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTS IFDDFSHYEKRQ 


6228 


47 


1978 


GRRCRRRGAVMELAQEARELGCWAVEEMGVPVAARAPESTLRRL 
CLGQGADIWAYIIiQHVHSQRTVKKIRGNLLWYGHQDSPQVRRKL 
ELEAAVTRLRAEIQELDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAG AMRRQQHTLRD PMQRLQNQLRRLQDMER KAKV 
DVTFGSLTSAALGLEPWLRDVRTACTLRAQFLQNLLLPQAKRG 
SLPTPHDDHFGTS YQQWLjS S VETLLTNHP PGHVIiAALEHLAAE R 
EAE I RSLCS GDGIiGDTEI S RPQA PDQSDS SQTLPSM VHL IQEG W 
RTVGVLVSQRSTKbKERQVbTQRLCXSLVEEVERRVLGSSERQVL 
ILGLRRCCLWTELKAIiHDQSQELQDAAGHRQLLLRELQAKQQRI 
LHWRQLVEETQEQVRLLI KGNSASKTRLCRS PGEVLALVQRKW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPSIHQLHPASPRGSSFIALSHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTS LP PG LPTQELLQ I QASQEKQ 
QKENLGQAbKRLEKLLKQALERI PEI.QG I VGDWWEQPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALQKLCS 


6229 


1571 


560 


GPS LliGTRGTPNPARTLQIFFL I IGRRLTGRMAAVDDLQFEEFG 
NAATSLTAN PDATTVN I EDPGETPKHQPGS PRGSGREEDDELLG 
NDDSDKTELLAGQKKSS PFWTPE YYQTFFDVDTYQVFDR I KGSL 
LP I PGKNFVRLYIRSMPDLYGP FWI CATLVFAIAISGNLSNFLI 
HLGEKTYHYVPEFRKVS IAATI I YAYAWLV PLALWGFLMWRNS K 
VMNIVSYSFLEIVCVYGYSLFI YI PTAILWI I PHKAVRWILVMI 
ALG ISGSLLAMTFWPAVREDNRRVALATIVTI VLLHMLLSVGCL 
AYFFDAPEMDHLPTTTATPNQTVAAAKSS 


6230 


1723 


600 


SKMSGRSGKKKMSKLSRSARAGVIFPVGRLMRYLKKGTFKYRIS 
VGAPVYMAAVI E YLAAE I L ELAGNAARDN KKAR IAPRHI LLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPEKRGRKATSGICKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGDGFT I LSSKS LVLGQKLS LTQSD I SH I GSMRVEGI VHP 
TTAEI DLKED IGKALEKAGGKEFLETVKELRKSOGPLEVAEAAV 
SQSSGLAAKFVIHCHI PQWGSDKCEEQLEETI KNCLSAAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSES IGI YVQEMAKLDAK 


6231 


149 


870 


LI FS S STMDRS LRNVLWS FG FLLLFTAYGGLQS LQSS L Y S E EG 
LGVTALSTL YGGMLLSSMFLPPLL I ERLGCKGT 1 I L5MCG YVAF 
SVGNFFASWYTLI PTS I LLGLGAAPLWS AQCTYLTI TGNTHAEK 
AGKRGKDMVNQ YFG I FFL I FQS SGVWGNLI S 5LVFGQTPSQETL 
PEEQLTSCGASDCLMATTTTNSTQRPSCX3LVYTLLGIYTGSGVL 
AVLMI AAFLQP I RDVQRESE 


6232 


3679 


1476 


FVAGTTMAGFWVGTAPLVAAGRRGRWPPQQLMLSAALRTLKHVL 
YYSRQCLMVSRNLGSVGYDPNEKTFDKILVANRGEIACRVIRTC 
KKMG I KTVAIH S D VDAS S VHVKMADE AVCVGPAPTSKSYLNMDA 
IMEAI KKTRAQAVHPG YG FLSENKEFARCLAAEDWFIG PDTHA 
IQAMGDKI ES KLLAKKAEVNTI PG FDGWKDAEEAVR IARE I G Y 
PVM I KAS AGGGGKGMR I AWDDEETRDGFRLSSQEAASSFGDDRL 
LIEKF IDNPRHI E I QVLGDKHGN ALWLNERECS I QRRNQKWEE 
APS I FLDAETRRAMGEQAVALARAVKYSSAGTVEFLVDS KKNFY 
FLEMNTRLQVEHP VTEC I TG LDLVQEM I RVAKG Y PLRHKQAD I R 
I NGWAVECRVYAEDP YKS FGLPS IGRLSQYQE PLHLPGVRVDSG 
IQPGSDIS I Y YDPM I SKLIT YGSDRTE ALKRMADALDNYVIRGV 
THNIALLREVI INSRFVKGD I STK FLS DVYPDGFKGHMLTKS EK 
NQLIAIASSLFVAFQLRAQHFQENSRMPVIKPDIANWELSVKLH 
DKVHTWASNWGSVFS VEVDGSKLNVTSTWNLAS PLLSVS VDGT 
QRTVQCLS REAGGNMS IQFLGTVYKVNI LTRLAAELNKFMLEKV 
TEDTS S VLRS PMPGVWAVS VKPGDAVAEGQEI CVI EAMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 


6233 


1 


2654 


HSTRENLNAGNPN F PSEGHLVRSTGPGGS FAKHMVAQCVSPKGP 
LACSRT YFFGATHVP YLGGDS KLPKKTEQI RLLSQ I YAAVI EAV 
LAGI ACYAKTSS LTKAKEVAEQTLGSGliDS FELI PFKAALRSKM 
TFHIHAVNNQGRIVPIjDSEDSIjSFVKTACKAVYDIPDIjLGGNGC 
LGSWFSESFLTSQILVKEKDGTVTTETSSWLTAAVPRFCSWL 
VEDNEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLYSSNLQSWP 
EEGNVHFFSSGLLFSHCRHGSI I ISKDHMNS ISFYDGDSTSTVA 
ALLI DF KS S LLPHLP VHFHG S SNFLM I ALFPKS K I YQAFYS E YF 
SLWKQQDNSGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPEI*DWFLQHFAISSISQEPVMRTHLPVLLQQ 
AE I NTTHRI ESDKVI I S I VTGLPGCHAS ELCAFLVTLHKECGRW 
MVYRQIMDSSECFHAAHFQRYLSSALEAQQNRSARQSAYIRKKT 
RLLWLQGYTDVIDWQALQTHPDSNVKASFTIGAITACVEPMS 
CYMEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRKPLLVQLQSL 
IRAANPAAAFI LAENG I VTRNEDI ELI LSENSFSS PEMLRSRYL 
M YPG W YEGKLNAGS VY PLMVQ I CVWFGR PLEKTRF VAKCKAI OS 
S I KPS P FS GN I YH I LGKVKFS DS ERTME VC YNTLANS L SIM P VL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSUCEDSIKDWLRQSA 
KQKPQRKALKTRGMLTQQE I RSI H VKR H LEPLPAG YFYNGTQFV 
NFFGDKTDFHPLMDQFmfDYVEEANREIEKYNQELEQQEYHDI.F 
ELKP 


6234 


1731 


404 


PRVREDMDHKSPGNKGSIiVYAGIKSIVKSSLGMVESSRHNWSGL 
DKQSDIQNLNEERI LALQLCGW I KKGTDVDVGPFLNSLVQEGBW 
ERAAAVALFNLD I RRAI QILNEGASSE KGDI^LNVVAMALSG YT 
DEKNSLWREI4CSTLRL0LNNPYLCVMFAFLTSETGSYDGVLYEN 
KVA VRDR VAFACKFLS DTQLNRY I EKLTN EM KEAGNLEG I LLTG 
LTKDGVDLMES YVDRTGDVQTAS YCMI .QGS PLDVLKDERVQYW I 
EN YRNLL DAWR FWHKRAEFD IHRS KLDP S S KPLAQVPVS CNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
AIjCLINMGTPVSSCPGGTKSDEKVDLSKDKKbAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNLVPAETV 
QP 


6235 


1 


571 


EKRDHRLPSWPRAALKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
DYES ODDS YEVLDLTE Y ARRHQ W WN RV FGHS SGPMVE KYS VATQ 
IVMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQIASHSGYVQ I 
DWKRVEKDVNKAKRQI KKRANKAAPE INNLI EEATEFI KQNI VI 
SSGFVGGFLLGLAS


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARIHAE 
NAI RQKNQAVNFLRMS ARVDAVAARVQTAVTMGKVT KSMAGVVK 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


P TAMAEEG IAAGG VMD VNTALQE VLKTALI HDGLARG I REAAKA 
LDKRQAHLCVLASNCDE PMYVKLVEALCABHQINLI KVDDNKKX. 
GEWGLCKIDREGKPRKWGC^CVVVKDYGKESQAKDVIEEYFK 
CKK 


6238 


2 


4666 


EE VPTQES VKWE INVI IKNPEI VF VADMTKNDAPALVI TTQCEI 
CYKGNLENS TMTAAI KDLQVRACP FLPVKRKGKI TTVLQPCDLF 
YQTTQKGTD PQVI DMS VKS LTLKVS ? VI INTMITITSALYTTKE 
TI PEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMI KMNI DS I FI VLEAGIGHRTVPMLLAKSRFSGEGKNW SSL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEI DQTEDFRPWNLGI K 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCX5 
LVMIJJNLVKAFTEAATGSS ADFVKDLAP FM I LNSLGLT I SVS PS 
DSFSVLNIPMAKSYVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFILLTPVNHSTADKI PLTKVGRRLYTVRHRESG VERS I VCQI 
DTVEGSKKVTI RSPVQ IRNHFS VPLSVYEGDTLLGTAS PENEFN 
I PLGSYRS FI FLKPEDENYQMCEGI DFEEI I KNDGALLKKKCRS 
KKPSKESFLINIVPEKDNLTSIjSVYSEDGVmLPYIMHLWPPILL 
RNLLP YKI AY Y IEG I ENS VFTLSEGHSAQ I CTAQLGKARLHLKL 
LDYLSHDWKS EYH IKPNQQDI S FVSFTCVTEMEKTDLDI AVHMT 
YNTGO/TVVAFHSPYWMVNKTGRMIiQYKADGIHRKHPPNYKKPVI* 
FS FQ PNHFFNNNKVQLMVTDSE LSNQ FS 1 DT VGSHG AVKC KGLK 
MDYQVGVTIDLSSFNITRIVTFTPFYMI KNKSKYHI SVAEEGND 
KWLSLDLEQCI PFWPEYAS SKLLI QVERSEDPPKRI Y KNKQENC 
I LLRLDNBLGG I I AE VNLAEHST VI TFLD YHDGAATFLL I NHT K 
NELVQYNQSS I»SE I EDSLP PGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMPIDLGEKTI YLVS FFEGLQR I ILFT EDPRVFK 

WWETKPKKKARWKPMSVKHTEKIiEREFKEYTESSPSEDKVIQL 
DTNVPVRL.TPTGHNMKILQPHVIALRRNYLPALKVEYNTSAHQS 
S FRIQI YRI Q I QNQI HGAVFPFVF Y P VKPP KS VTMDSAPKPFTD 
VS IVMRSAGHSQISRI KYFKVLIQEMDLRLDLGFIYALTDLMTE 
AEVTENTEVEliFHKDIEAFKEEYKTASLVDQSQVSLYEYFHISP 
I KLHLS VSLSSGREEAKDS KQNGGLI PVHSLNLLLKS IGATLTD 
VQDWFKLAFFELNYQFHTTSDLQSEVIRHYS KQAI KQM YVLII* 
GLDVLGNP FGL I REFSEGVEAFFYE P YQGAI QG PEEFVEGMAIG 
L KALVGGAVGG LAG AAS K I TGAMAKG VAAMTMDED YQQKRREAM 
NKQPAGFREG I TRGG KGLVSGFVSG ITG I VTKPI KGAQKGGAAG 
FFKGVGKGLVGAVAR PTGG I IDMASSTFQGIKRATETSEVESLR 
PPRr r NJilJvj V ± K tr I KJjKIAj 1 Xj\jt\x.\2ic i Kfinj.1 unoooouu 
DDDDDDDDESDLNH 


6239 


2108 


634 


KPGMAGKGSSGRHPLLLGLLVAVATVHLVICPYTKVEESFNLQA
THDLLYHWQDLEQYDHLEFPGVVPRTFLGPWIAVFSSPAVYVL
SLLEMSKFYSQLIVRGVLGLGVIFGLWTLQKEVRRHFGAMVATM
FCWVTAMQFHLMFYGTRTLPNVLAL 

WLSAFAI I VFRVSL^IiFLGLLLLLALGNRKVSVVRALRHAVPAG 
II^LGLTVAVDSYFWRQLTWPEGKVLWYNT\OjKKSSNWGTSPLIj 
wvfv<: r«5T .T.PT Pt.GI»VDRRTHAPTVIiALGFKALYSLIj 
PHKELRFIIYAFPMLNITAARGCSYLLNNYKKSWLYKAGSLLVI 
GHLVVNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI 
DVAAAQTGVS R FLQVNS AW R YDKREDVQPGTGMLAYTH I LMEAA 
PGLLALYRDTHRVLAS WGTTGVSLNLTQLP PFNVHI^TKLVLIi 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTSIAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGF3LGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDF3SVQAGPEPDPPLGYTSPFLSARI* 
AQQREAERHPRLVPTGPTHR3PS PVRYDNLSRH I VASLQEREKL 
LRQS P PLPGREEE PGLGDSG I QSTPGSGHAPRTS S S SDDS KRS P 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQ PG VSBTEE VAIjQ PLLTPKDEVQLKTT YSKSNGQPKSLGS 
ASPGPGQPPLSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAEEKKRLSLQREKI IARVS I DNRTRALVQALRRTTDPKLCIT 
RVEELTFHLLEFPEGKGVAVKERI I PYLLRLRQI KDETLQAAVR 
EILAL IG YVDPVKGRGIR I LS I DGGGTRGWALQTLRKLVELTQ 
KPVHQLFDYI CGVS TG AI LAFMLGLFHMPLDECEEL YRKLGSD V 
FSQNVI VGTVKMS WSHAF YDSQTWEN I LKDRMGSALMIETARN P 
TCPKVAAVST IVNRGITPKAFVFRNYGHFPG INSHYLGGCQYKM 
WQAIRASSAAPGYFAEYAIiGNDI^QDGGLLLNI^SALAMHECKC 
LWPDVPLECIVS LGTGRYES DVRNTVTYTSLKTKLSNVINS ATD 
TEEVHIMLDGLLPPDTYFRFNPVMCENI PLDESRNEKLDQLQLE 
GLKYIERNEQKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP 
FFSKL


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVBMGE
SS ED I DQM FS TLLGE M DLLTQS LG VDTLPPPDPNP PRAE FNYS V 
GFKDLNESLNALEDQDLDALMADLVADISEAEQRTIQAQKESLQ 
NQHHSASLQASIFSGAASL.GYGTNVAATGISQYEDDLPPPPADP 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKIALEKLKEAKVKKLV 
VKVHMNDNSTKSLMVDERQIARDVLDNLFBKTHCDCNVDWCLiYE 
IYPELQIERFFEDHENWEVLSDWTRDTENKILFLEKEEKYAVF 
KNP ON FYLDNRG KKES KETNE KMNAXNKES LLE VRLI LQSGRKE 
KDVCS I FKS FASENKGKI 


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TS RASSRRLACG PQTRAGAETRSTAMI RAKSAARDTRRATCRSA 
AGTPSPTTMTCLTDVPTGCAAVEPTARLPAAAWASTITTGCCPA 
MGQAGAGPAGRKGS EAGGG PGRAHKAHPSPLPREPRVRTG ? PAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPLK 
GPPILAPILSLTPILSRWSCYFPRSRIAQGWHLS 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSW\WQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI IEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
I CISLAFWI I SMTASTY YGNLRPIS PWRWLFS VWP VLI VSNGL 
KKKSLDHSGAIX3GLWGFILTIANFSFFTSLLMFFLSSSKLTKW 
KGEVKKRLDS E YKEGGQRNW VQVFCNGAVPTEIALLYK1 ENG PG ' 
EIPVDFSKQYSASWMCLSLLAALACSAGDTWASEVGPVLSKSSP
Rltl TTWEKVP VGTNGGVTWGLVS SKLGGTF VGI AYFLTQLI FV 
NDLDISAPQWP I IAFGGLAGLLGS IVDSYLGATMQYTGLDESTG 
MWNS PTNKARHI AGKP ILDNNAVNLFS S VL IALLL PT AAWG FW 
PRG 


6246 


1177 


359


SLWPV/IbMDDSJLMQISIiQLLCVYTANFPNGCSSLCWSSCGQHPV 
QATHRGAVSNSLMLCI LKIASQMPLENTTVQQIWFMLLSNLALS 
HDCKGVIQKSNFLQNFLSIAIjPKGGNKHLSNIiTILWIjKLLLNIS 
SGEDGQQMIIiRIjDGCIjDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
PANKPKI LANEKVI TVLAACDESENQNAQRIGAAALWALI YNYQ 
KAKTAI#KSPSVKRRVDEAYSIiAKKTFPNSEANPI*NAYYLKCIfEN 
LVQLLNSS 


6247 
6248


3 


1678 


NSRVWGPWTEPSAGSI^PMARKQNRNSKELGLVP3bTDDTSHAGP~ 
PGPGRALLECDHLRSGVPGGRRRKDWSCSLLVASLAGAFGSS fl 
YGYNLSVVNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGTLIVKMIGKVLGRKHTLLANNGFAISAALLMACS
i^agafemlivgrfimgidggvalsvlpmylseispkeirgslg 
qvtaificigvftgqllglpellgkestwpylfgviwpawqi, 
LSLPFLPDSPRYLLLEKHNEARAVKAFQTFLGKAHVSQEVEEVL

AESRVQRS IRLVS VLELLRAP YVRWQ VVTVI VTMAC YQLCGLNA 
IWFYTNS I FGKAGI P PAKI P YVTLSTGG IETLAAVFSGLVIEHIj 
GRRPLLIGGFGLMGLFFGTLTITLTLQDHAPWVPYLSIVGILAI 
IASFCSGPGGIPFI LTGEFFQQSQRPAAF1 IAGTVNWLSNFAVG 
LZj FPF 1 QKSLDT YCFL VFATI CI TGA I YL YFVLPETKNRTYAE I 
SQAFS KRNKAYPPEEKI DSAVTDGKINGR P 




56 


1773 


VP PPRMMAAVP PGLE PWNRVR I PKAGNRS AVTVQNPGAALDLCl 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL
ALKQVEQ CLKRLKNMNLEGS I QDLFELFS SNENQPLTTKVCWP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII
LNLVMVGLVS RLWVL YKGVLKRL I LLYEPLFGLLQE VARIQ PM P 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQS PRAS EETLLG I SKKAKQMKINVQNNVDLGQ P VKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEE1QMAWWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECI KTS ICNHLLRGSGIK 
TSKHHLRQRRSQNKFIjRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTSAKWRLSHCT VHRTDLY PNSKQLLNSGVSMPVI QTKEKM I 
HENLRG IHENETDS WTVMQ INKNSTSGTI KETDDIDDI FALMGV 


6249 


56 


1773 


VP P P RMMAAVP PGLE P WNRVR I P KAGNRS AVT VQNPGAALDLC I 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII
LNLVMVGLVSRLWVLYKGVLKRL ILLYBPLFGLtiQEVAR I QPM P 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLIiNKLF 
L INEQSPRAS EETLLGI SKKAKQMKI NVQNNVDLGQ PVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VI GT PHAKS FVQR FREAESFTQLS EE IQMAWWCRS K KLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECI KTSI CNHLLRGSGIK 
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLIjNSGVSMPVIQTKEKMI 
HENLRGIHENETDSWIVMQINKNSTSGTI KETDDIDD I FALMGV 


6250 


232 


1306 


LAALH IMALPFRKDLEKYKDLDEDELLGOTiSETELKQLETVLDD 
LD PENALLP AG FRQKNQTSKSTTG P FDREHLLS YLEKEALEHKD 
RED YVPYTGEKKGKI FI PKQKPVQTFTEEKVSLDPELEEALTSA 
SDTELCDLAAILGMHNLITNTKFCNIMGSSNGVDQEHFSNVVKG 
EKI LP VFDE P P NPTNVE ESLKRT KENDAHLVE VNLNN I KN I PI P 
TL KD FAKALETNTHV KC FS LAATRS NDP VATAFAE MLKVNKTLK 
SLNVESNF I TGVG ILAL IDALRDNETLAELKI DNQRQQLGTAVE 
LEMAKMLEENTNILKFG YQFTQQG P RTRAANAI TKNNDLVRKRR 
VEGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNKWS KVRHI KGPKDVERSRI FS 
KLCLN I RLAVKEGGPNP EHN S N LAN I LEVCRSKHMPKST I ETAL 
KMEKS KDT YLLYEGRGPGGSSLL I EALSNSSHKCQADIRH ILNK 
NGGVMAVGARHS FDKKGVIWEVEDREKKAVNLERALEMAIEAG 
AEDVKETEDEE ERNVFKF I CDAS SLHQVR KKLDS LGLCS VS CAL 
EFIPNSKVQLAEPDLEQAAHLIQALSNHEDVIHVYDNIE 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TVSLWSPAMKKPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG
TVSLWSPAMKKPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALGRRGGSQELSAAACGCFA1.RLRAPGSGRPALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLKEYRI CMPLTVDE YKIGQLYM I SKH 
S HEQSDRGEGVE WQNE P FEDPHHGNGQFTE KR V YLNS KL P S WA 
RAWPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIHIETKYEDN 
KGSNDTI FDNEAKDVEREVCFIDIACDEI PERYYKESEDPKHFK 
SBKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
nAVV ± Ltii x VjnKUAr AW V Dr. W YDMTMDDVRE YEKNMHEQTNI K 
VCNQHSS PVDDI ESHAQTST 


6255 


1 


1444 


PTRPQQELLVSIiATVIFVASQKALSVESKAVIKQQLESVSNGWT 
VYR I ARQASRMGNHDMAKELYQSLLTQ VAS KH F YFWLNS I»KEFS 
HAEQCLTGLQEENYSSALSCIAESLKFYHKGIASLTAASTPLiNP 
LS FQCEFVKL R I DKLQAFSQL 1 CTCNS LKTS P P PAI ATT I AMTL 
GNDLQRCGRI SNQMKQ S MEEFRSIiASR YGDI* YQAS FDADS ATLR 
NVEIjQQQSCLLISHAIEALILDPESASFQEYGSTGTAHADSEYE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIAIiLKVPL 
SFQRYFFQKLQSTS I KXALS P S P RN P AE P IA VQNNQQLALK VEG 
WQHGS KPGL FRKI QS VCLNVSSTLQS KSGQDY KI PIDNMTNEM 
EQRVE PHNDYFSTQFLLNFAI LGTHN I TVES S VKDANG I VWKTG 
Fn * *r vivoiiCiur x Q'w^iKij^^^yAyy rixjQvQQRNAYTRF 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSLESTSTSVPPAPGTMATDSWAlaA 
VDEQEAAAESLSNLHIjKEEK I KPDTNGAWKTNANAEKTDE EEK 
EDRAAQSLLNKLI RSNLVDNTNQVE VLQRDPNS PLYSVKSFEEL 

TGKTAAFVLAMLS Q VE PANKYPQCLCLS PTYE LALQTGKVI EQM 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKIiK 
FIDPKKI KVFVLDEADVM I ATQGHQDQS I R I QRML PRNCQMLLF 
SATFEDS VWKFAQKWPDPNVI KLKREEETLDTIKQYYVIiCS SR 
DEKFQALCNLYGAITI AQAMI FCHTRKTASWIiAAELSKEGHQVA 
LI^GEMMVEQRAAVIEllFREGICEKVLVTTNVCARG IDVEQVSW 
INFDL PVDKDGN PDNETYLHR I GRTGR FGKRGIjAVWMVDSKHSM 
NILNR IQEHFNKKIERLDTDDLDEIEKIAN 


6257 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE 
NNIVKEELALLDGSNVVFKLLGPVLVKQELGEARATVGKRLDYI
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG
KA


6258


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE
NNIVKEELALLDGSNVVFKLLGPVLVKQELGEARATVGKRLDYI
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG
KA 


6259 


2 


1540


ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV 
S VENGDRGSKTFNLGTDPVSItRNYPYKI CDSCEMNhKNISGh 1 1 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPS FGQS FE YS KNGQGFHDEAAFFTNKRSQ I GETVCK 
YNECGRTFIESLKLNISQRPHLEMEP YGCS ICGKS FCMNItRFGH 
QRALTKDNPYEYNEYGEIFCDNSAFIIHQGAYTRKILREYKVSD 
KTWEKSALLKHQ I VHMGGKS YD YNENGSN FS KKSHI*TQI*RRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ 
KPHLTNHQRTHTGEKPYECKQCGKTFCVKSNLTEHQRTHTGEKP
YECNACGKSFCHRSALTVHQRTHTGEKPFICNECGKSFCVKSNL
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN
ACGKTFSQRSVLTKHQRIHTRVKALSTS


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMAL.RELKVCLLGDTGVGKSSIVW 

P"R"\/P*T™>CT7'TH}'MTMiyT l t r* a C ITMT VT\fr» VnMPT UVCT T UTVPIV r>nt?D T7 

t\r V iLUzzr Llt'NXvttr 1 luAol? 1"1 L3S. 1 v\J X\JNttljriiS.r liXWO rA(juc>Kr 
RALAPWYYRGSAAAIIVYDITKEETFSTLKNWVKELRQHGPPN1 
WAI AGNKCDLIDVRE VMERDAKDYADS IHAI FVETSAKNAINI 
NEI*F I E I S RRI PSTDANL PSGGKG FKLRRQPS EP KRS CC 


6261 


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR 
SPACEPCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQPGPWPG 
MAEVSI DQS KLPGVKEVCRDFAVLEDHTLAHSLQEQE I EHHLAS 
NVQRNRIjVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA 
QE I QEKLAI EAERRRIQEKKDEDI ARLLQEKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRW1KEAVSTPSRMAHRDQEWYDA 
E I ARKLQEEELliATQVDMRAAQVAQDE E IARLLMAEE KKAYKKA 
KEREKSSLDKRKQDPEWICPKTAKAANSKSKESDEPHHSKNERPA 
RPPP P I MTDGEDAD YTH FTNQQS S TRHFSKSES SHKGFH YKH 


6262 


2


1759 


P ECHS QGLCS VHRPGKVP OARMSGLVLGQRDE PAGHRLSQ E EI JL 
GSTRIjVSQG LEALRS EHQAVLQST iS QTI ECLQQGGHEEGLVHEK 
ARQLRRSMENIELGLSEAQVMLALASHLSTVESEKQKLRAQVRR 
LCQENQWLRDELAGTQQRLQRSEQAVAQLEEEKKHLEFLGQLRQ 
YDEtKSHTSEEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGG YE I PARIiRTLHNIjVI QYAAQGRYEVAVP LCKQALEDLER 
TSGRGHPDVATMLNI LAliVYRDQNKYKEAAKLIiNDALS IRESTL 
G PDH P AVAATLNNLAVL YGKRGKYKEAEPLCQRAIjE IRE KYIX5T 
NHPDVAKQI^LALLCQNQGKYEAVERYYQRALAIYEGQLGPDN 
PNVARTKNNLASCYLKQGKYAEAETLYKEILTRAHVQEFGSVDD 
DHKPIWMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSSPTV 
NTTLRNLGALYRRO^KLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKPEGGEDASVAVEWSGtXSSGTLQR 
SGSLGK I RD VLRR 


6263 


1 


2408 


REIiDSIADLPERI KPP YANGLSTSHIiRSSSVEDVKLl I SEGRPT 
IEVRRCSMPSVICEHTKQFQTISEESNQGSLLTVPGDTSPSPKP 
EVFSNVPERDLSNVSNI HS S FATS PTGASNSKYVS ADRNL I KNT 
APVWTVMDSPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDFIC 
PNSN I PDQESS LQS FCNSENKVLKENADFLSLRQTELPGNS CAQ 
DPASFMPPQQPCS FPSQSLSDAES 1 S KHMSI*S YVANQEPG I IiQQ 

MN/WSJX Lo&J\LiU lUci&£> ISteJl CtNXc VJLAriJVyK.1 UAr VPV YoDbl 

IQEASPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAFSKLTYK 
SSSGHEVENSTTDTQVISHEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNEQDPKHCPES E KCUjS I EDEESQQS I LSSLENHSQ 
QSTQPEMHKYGQLVKVELEENAEDDKTENQIPQRMTRNKANTMA 
NQSKQIIiASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQPVQVS PS LliQAKEKTQQSLAAI VDSLKLDE IQPYSSER 
ANPYFEYIiHIRKKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRLQHSIE 
REKLIVSNEQEVURVHYRAARTLANQTLPFSACTVLLDAEVYNV 
Pl^SQSDDSKTSVRDRFKARQFMSWLQDVDDKFDKIiKrCIJjMRQ 
QHEAAAIjNA VQR LEWQIiKLQELDPAT Y KSIS I YEI QEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQEMNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL
DKKRDDFCKQNQEASSDRCSALLQVIFSPLEEEVKAGIYSKPGG
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265 


143 


1960 


KHRQENNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL
SSTDVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL
DKKRDDFCKQNQEASSDRCSALLQVIFSPLEEEVKAGIYSKPGG
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDG I PEVTASE 
GFTVNE INKKS IHI SCPKENASS KFLAPYTTFSRIHTKS ITCLD 
ISSRGGLGVSSSTDGTMKIWQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKI WS AEDASCWTFKGHKGG I LDTAI VDR 
GRNVVSASRDGTARLWDCGRSACLGVLADCGSS INGVAVGAADN 
SINLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRQLVF 
LFIGSDAFNCXTFLSGFLIiLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGAPVLSLLSVRDGFIAS0/3DGSCFIVQQDLDYVTELTGADCD 
PVYKVATWEKQIYTCCRDGLVRRYQLSDL 


6267


3 


622 


IX5MMKKNNSAKRGPQDGNQQPAPPEKVGWVRKFCGKG I FREIWK 
NR YVVLKGDQLYI SE KEVKDEKNIQE V FDLSDYEKCE ELRKS KS 
RSKKNHSKFTTiAW55Tfni>^5KITaDiMT.T n*T.n wc-dtttvt^otj -i-*t*\ t * 

nLiAium^AC X AXrtTLOI\.^f X/iATfjjl J? Lift Voi J .E».E»l\iiO W 1NALNSA 

ITRAKNRILDEVTVEEDSYLAHPTRDRAKIOHSRRPPTRGHLMA 
VASTSTSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


1368 


HRELCQNLPAGLSSALI DNPLTLLLS XDTYVMLQEPVTFQDVAV 
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWETTLENKELA 
PNSD I PBEEPAPSLKVQES SRDCALSSTLEDTLQGGVQE VQDTV 
LKQMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 
SQANSGALDTNQVLLHKI PPRKRLRKRDSQVKSMKHNSRVKIHQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
IFRNPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS 
GERPFECQECGRTFNDRSAISQHLRTHTGAKPYKCQDCGKAFRQ 
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR 
KKKQPTS 


6269 


2886 


1449 


HASAPTRRNMAAAS PLRDCHAWKDARLPLSTTSNEACKLFDATL " 
TQ YVKWTNDKS LGG I EG CLS KLKAADPTFVMGHAMATGLVLI GT 
GSSVKLDKELDLAVKTMVE ISRTQPLTRREQLH VSAVETFANGN 
FPKACE LWEQI LQDHPTDMLALKFSHDAY F YLG YQEQMRDS VAR 
I YP FWTPDI PIjSSYVKG I YS FGLMETNPYDQAEKLAKEALS INP 
TDAWS VHTVAH I HEMKAE I KDGLEFMQHSETLWKDSDMLACHNY 
WHWALYLIEKGEYEAALTIYDTHILPSLQANDAMLDVVDSCSML 
YRLQMEGVS VGQRWQD VLP VARKHSRDHI LLFNDAHFLMASLGA 
HDPQTTQELLTTLRDAS ES PGENCQHIiLARDVGLPLCQALVKAE 
DGNPDRVLELLLPIRYRIVQLGGSNAQRDVFNQLLIHAALNCTS
SVHKNVARSLLMERDALKPNSPLTERLIRKAATVHLMQ


6270 


23 


2086 


SVTVTLGSEGDGRPPTYHLEEMEQEPQNGEPAEIKIIREAYKKA 
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRGISISSKESEHT 
G PGWESARQMQQKMKETLQNVRTRIiE I LE KGLATS LQNDLQE VP 
KLYPEFP P KDMCEKIiPEPQS FSSAPQHAE VNGNTSTPSAGAVAA 
PAS LS LPSQSCPAEAPPAYTPQAAEGHYTVS YGTDSGEFSS VGE 
E FYRNHSQPPPLETLGLDADEL I LI PNGVQ I FFVNPAGEVS APS 
YPGYLRIVRFIJDNSLDTVLNRPPGFLQVCDWLYPLVPDRSPVLK 
CTAGAYMFPDTMLQAAGCFVGWLSSELPEDDRELFEDLLRQMS 
DLRLQANWNRAEEENEFQIPGRTRPSSDQLKEASGTDVKQLDQG 
NKDVRHKG KRG KRAKDTSSEEVNLSH I VP CEPVPEE K P KELPEW 
S E KVAHN I LSGAS WVS WGLVKG AE I TGKA IQKGASKLRER IQPE 
E KP VEVS P AVTKG I* Y I AKQATGG AAKVSQFLVDGVCTVANC VGK 
ELAPHVKKHGS KLVPES LKKDKDG KS PLDGAMVVAAS S VQG FST 
V^QG LECAAKCIVNNVSAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYN I NN IG I KAM VKKTATQTGHTLLED YQI VDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


1058 


GCG VKTAGMVG RE KELS IHFVPG S CRLVEEEVNI PNRRVLVTGA 
TGLLGRAVHKEFCX3NNWHAVGCGFRRARPKFEQVNLLDSNAVHH 
IIHDFQPHVIVHCAAERRPDVVENQPDAASQLNVDASGNIAKEA 
AAVGAFLI YI SSDYVFDGTNPP YREEDIPAPIaNLYGKTKLDGKK 
AVLENNLGAAVLRI P I LYGE VEKLEES AVT VM FDKVQFSNKSAN 
MDHWQQR FPTHVKDVAT VCRQLAE KRMLDPS I KGTFHWS GNEQM 
TKYEMACAIADAFNLPSSHLRPITDSPVLGAQRPRNAQLDCSKI* 
ETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272 


1136 


528


G AVMEDAAAPGRTEG VLERQGAP P AAGQGGALVELTPTPGGLAL 
VS P YHTHRAGD PLDLVALAEQVQKADEFIRANATNKLTVT AEQI 
QHLQEQARKVLEDAHRDANLHHVACNIVKKPGNIYYLYKRESGQ 
QYFS I ISPKEWGTSCPHDFLGAYKLQHDLSWTP YEDIEKQDAKI 
SMMDTLLSQSVAIiPPCTEPNFQGLTH 


6273 


256 


843 


SCPRVSPECRSLGCQVMFSLPI*NCSPDHIRRGSCWGRPQD1,KIA ' 
SAAWNS KCHPG AGAAMARQHARTLW Y0RPRYVFMEFCVEDS TDV 
HVLIEDHRIVFSCKNADGVEbYNEIEFYAKVNSKDSQDKRSSRS 
ITC F VRKWKEKVAW PRLTKEDI KP VWLS VDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLSRFRGCIAGAbLGDCVGSFYEAHDT 
VDLTS VLRHVQS LE PDPGTPGS ERTE AIiYYTDE)TAMARAI>VQSIj 
LAKEAFDEVDMAHRFAQEYKJ03PDRGYGAGVVTVFKKLLNPKCR 
DVFEPARAQFNGKGSYGNGGAMRVAGISLAYSSVQDVQKFARLS 
AQLTHAS SLG YNGAI IjQALAVHLALQGES S S KHFLKQLLGHMED 
LEGDAQSVLDARELGMEERPYSSRLKKIGELLDQASVTREEWS 
ELGNGIAAFESVPTAI YCFLRCMEPDPE I PS AFNSLQRTLI YS I 
S LGGDTDT I ATMAGAIAGAY YGMDQ VPES WQQS CEG YEETD I LA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVPRPAKTMAFMVKTMVGGQLKNLTGSLG
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR
KAERATLRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI
EEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE
KCHVM


6276 


797 


97 


TLLPLPPLPDTEGM I LLNTGLEGT VAENPVP I VHTPSGNILTLE 
SCI^IATHPGHWGIHLQIAEPAAI^PSLALiARLSSLGLLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LGSG YREQLLTDMLELCQGLWQ P VS FQMQAMLLGHS TAGAI GRL 
LASS PRATVTVEHNPAGGD YAS VRTALLAARAVDRTRVYYRLPQ 
GYHKDLLAHVGRN 


6277 


4600 


2744 


MAFRTE MGLY YS YFKT I VE APS FLNG VWM IMNDKLTE YP LVI NT 
LKRFNLYPE VI 1ASWYRI YTKIMDLIG IQTKiaf TVTI GEG t*SP 
TBS CEGLGDPACFYVAVI FI LNGLMMALFFI YGT YLSGSRLGGL 
VTVLCFFFNHGECTRVMWTPPLRES FSYPFLVLQMLLVTHILRA 
TKL YRGSLI ALC I SNVF FML P WQFAQFVIXTQ I ASLFAVYVVGY 
IDICKLRKI 1 YIHMISLALCFVLMFGNSMLLTSYYASSLVI I WG 
ILAMKPHFLK1NVSELSLWVIQGCFWLFGTVILKYDTSKIFGIA 
NDAH I GNLLTS KrFSYKDFDTLLYTCAAEFDFMEKETFLRYTKT ■ 
LLLPWLVGFVAIVRKIISDMWGVLAKQQTHWKHQFDHGEDVY 
HALQLLAYTALG I LIMRLKLFLTPHMCVMASLT CSRQLFGWLFC 
KVHPGA I VFAI LAAMS I QGSANLQTQWNT VGEFSNLPQEELI BW 
I KYSTKPDAVFAGAMPTMASVKLSALRP I VNHPH YEDAGLRART 
KIVYSMYSRKAAEEVKRELI KLKVNYYILEESWCVRRSKPGCSM 

PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEW 
KB 


6278 


3 


823 


ILFRIiVLLSLVYLLNSVATEERKPAEVblVEGQQYAWGTVLLL 
IRI ILE YCQGVDNI PSVTTDMLTRLSDLLKYFNSRS CQLVLGAG 
ALQWGLKTITTKNIiALSSRCbQblVHYIPVIRAHFEARLPPKQ 
YSMLRHFDHITKDYHDHIAEI SAKLVA I MDSLFDKLLS KYEVKA 
PVPSACFRNICKOMTKMHEAIFDLLPEEQTQMLFLRINASYKLH 
L KKQLS HLNV I NDGG PQNGL VTAD VAF Y TGNLQ AL KGLKDLDLN 
MAEIWEQKR 


6279 


127 


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPP FDPLLHGTL 
LRSTAKM PTTP VKAKRVSTFQE FESNTS DAWDAGEDDDELLAMA 
AE S LN S E VVMET ANR VLRNHS Q RQGR PTLQE GPGLQQ K PR P EAE 
PPSPPSGDLRLVKSVSESHTSCPAESLASDAAPLQRSQSLPHSAT 
VTLGGTSD PSTLSS SALS EREASRLDKFKQL LAG PNTDLEELRR 
LS WSG I PKPVRPMT W KLLSG YLPANVDRRPATLQRKQK3 YFAFI 
EHYYDSRNDEVHQDT YRQIH ID I PRMS PEALI LQPKVTEI FERI 
LF I WAIRHPASG YVQGINDLVT P FF VVFICE YIEAE E VDTVDVS 
G VPAE VLCNI EADT YWCMS KLLDG IQDNYTFAQPG I QMKVKMLE 
ELVSRIDEQVHRKLDQHBVRYLQFAFRWMNNLLMREVPLRCTI R 
LWDTYQS E PDGFSHFHL Y VCAAFLVRWRKB I LEEKD FQE LLLFL 
QNLPTAHWDDEDISLLLAEAYRIiKFAFADAPNHYKK 


6280


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE
DVDLAQV^YLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS
YSQKAFCGIYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKWVYDLLSGHIVKK
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM
PESEECASAPAPVPQSSTPFSSPQ


6281 


857


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE
DVDIiAQVIAYLIJiRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS
YSQKAFCGIYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKWVYDLLSGHIVKK 
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM
PESEECASAPAPVPQSSTPFSSPQ 


6282 


12S 


906 


FVTNTTKESKQDLLERLRKLEFDISEDEIFTSLTAARSLLERKQ
VRPMLLVDDRALPDFKGIQTSDPNAVVMGLAPEHFHYQILNQAF
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL 


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSWKPJSEGEEKTLTGDVKTSPPRTAP 
KKQLPS I PKNALPITKPTS PAPAAQSTNGTHAS YGPFYLEYSLL 
AE PTLWKQ KLPG VYVQPS YRSALM WFGVI F I RHGLYQDGVFKF 
TVYI PDNYPDGDCPRLVFDI PVFHPLVDPTSGELDVKRAFAKWR 
RNHNHIWQVLMYARRVFYKIDTASPLNPEAAVLYEKDIQLFKSK 
WDS VKVCTARLFDQPKI EDPYAI SFS PWNPSVHDEAREKMLTQ 
KKKPEEQHNKS VHVAGLS WVKPGS VQ P FSKEEKTVAT 


6284 


1 


2879 


RSVI PGSTI SSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQ VERENVQKRTFTR W I NLHLEKCN PPLE VKDL FVDIQDGKI 
LKALLE VLSGRNLLHE YKS S SHR I FRLNNIAKALKFLEDSNV KL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
LAPGSGGTDSDSS FPPTPTAERSVAI S VKDQRKAI KALLAWQR 
KTRKYGVAVQDFAGS WRSGLAFLAVI KAIDPS LVDMKQ ALENST 
RENLEKAFS I AQDAUII PRLLEPEDI MVDTPDEQS IMT YVAQFL 
ERFPELEAEDI FDSDKEVP I ESTFVR I KETPSEQESKVFVLTEN 
GERT YTVNHETSHPPPS KV F VCDKPE S M KEFRLDGVSSHAItSDS 
STEFMHQ 1 1 DQVIjQGG PG KTSDI SEPS PESS I LS SRKENGRSNS 
LP I KKTVHFEADTYKDPFCSKNt>SIiCFEGSPRVAKESnRQDGHV 
tAVEVAEEKEQKQESSKIPESSSDKVAGDIFIiVEGTNNNSQSSS 
C^GALESTARHDEESHSI>SPPGENTVMADSFQIK^7KLMTVEALE 
EGD YFEAI PLKASKFNSDLI DFASTSQAFNKVPSPHETKPDEDA 
EAFENHAEKLGKRS I KSAHKKXDS PEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETP VDKKPEVHE KAKRKSTRPHYEEEGEDDDLQGVG 

YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVIiYLSRGGVGTTPASEPAP 
LAPHEDHQQRETKENDPMDSHQSQESPNLENIANPLEEIWTKES 
I S S KKKEKRKHVDHVESSLFVAPGS VQS SDDLE EDS SDYS I PSR 
TSHSDSS I YLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


SCKTENLLEMWWFQQGLSFLPSALVI wtsaafi fs yi tavtlhii 
ID PALP Y I SDTGTVAPE KCLFGAMLNI AAVLCIAT I YVRYKQ VH 
ALSPEEKVI I KLNKAGLVLG I LS CLGLS I VANFQKTTLFAAHVS 
GAVLTFGMGSLYMFVQT I LS YQMQPKI HGKQVFW I RLLLVI WCG 
VSALSMLTCSSVLHSGNFGTDLEQKLHWNPEDKGYVLHMITTAA 
EWSMSFS FFGFFLT Y I RDFQKI SLRVEANLHGLTLYDTAPCPIN 
NERTRLLSRDI 


6286 


1619 


276 


KAGASCCGSANP YVS VGKS C VLLAMAQLQTRF YTDNKKYAVDD V 
PFSI PAASE IADLSNI INKLLKDKNEFHKHVEFDFLI KGQFLRM 
PLDKHMEMENISSEEWEIEYVEKYTAPQPEQCMFHDDWISSIK 
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDWKDVAWVKKD 
SLSCLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGSVDSIA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEBDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEBICSASWDHTIRVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVSLSLTSHTGWVTSVKWSPTHEQQLI SGSLDN1 VKLWDT 
RSCKAPLYDLAAHEDKVLS VD WTDTGLLLSGGADNKLYS YRYS P 
TTSHVGA 


6287 
6288 


278


1482 


mqfffnfqiglrstsgkekysgdagflgdalqlflqcLaldedf 

APAFCLQVQKILCDbLLPENLKEGLKESSWSSLPCTKNRPFDFHS 
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNEDGR 
WKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYUUiRRycVT 
QLLEELI VKYLPDE LS E RKK I YDE ETAE LSHLTKNV P I FVCTMA 

YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 

GCMLQ1RNVHFUPDGRSWDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV 


6289


1


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYIAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 




743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM 
KVGIAIKPGTSVEYIAPWANQIDMALVMTVEPGFGGQKFMEDMM 
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6291 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRMISRYTRKA 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
DI TRES S FTSADTGNSLS AFPS YTGAGISTEGSSDFS WG YGELD 
QNATEKVQTMFTAIDELIiYEQKLS VHTKS LQEECQQWTAS FPHL 
RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFS SS YAHKASS IAKSSSFCSMER DEEDS I IVSEG I IEEY 
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
VFDS VWCKWS CMEQI/TRSHWEGFAS DDESNVAVTRPDS ESS CV 
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG 
MPI^PRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLS YTVQSTRRRNPPPRTLHP ISTSHSCAETPRS VEE ILRGA 
RVPVAPDSLSSPSPTPLSRWNLLPPIGTAEVEHVSTVGPQRQMK 
PHGDSSRAQSAWDEPNYQQPQERLLDPDFFPRPNTTQSFLLDT 
QYRRS CAVE YPHQARPGRGSAGPQLHGS TKS QSGGRP VS RTRQG 


6292 


1732 


602 


■L. VAKMAS SASART PAGKR VI NQEEIiRRLMKEKQRI*S TSRKR I ES 
P FAKYNRLGQLS CALCNTP VKSEI^WQTHVLGKQHREKVAELKG 
AKEASOGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST 
SAWTTWFDKIGKEFlRATPSKPSGLSIiLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
API I PHSGS IEKAE IHEKWERRENTAEALPEGFFDDPEVDARV 

RKVDAPKDOMDKFWnPPniflVMDAUNrrT C17KTtnvcir<nnr>nnT _ 

jvavv^i iuiui h ycr yjuiWKy v w i -L i>Jc*/vJ. VAiiEDEEGRLDRO 
I GEIDEQ I ECYRR VE KJLRNRQDEI KNKLXEI LTI KELQKKEEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6293 


1835 


1142 


TCPGAMKM VAPWTRF YSNSCCLCCHVRTGTI LLGVW YL»I INAW 
LLI LLS ALAD PDQ YNFSSSELGGDFEFMDDANMC I A I A I SLLM I 
LI CAMAT YGAYKQRAAW 1 1 PF FCYQ I FD FALNMLVAI TVL I Y PN 
SIQEYIRQLPPNFPYRDDVMSVNPTCIiVIjI ILLFIS I ILTFKGY 
LIS CVWNC YR Y I NGRNSSD VI,VYVTSNDTTVLLP P YDDATVNGA 
WCEPPPPYVSA 




2382 


1035 


KWCTLGTVDVHPIGWCAINSKILVPPRTIHAKFTDWKGYLMKRIi " 
VGSRTIiP VD FHI KM VESMKYP FRQGMRLE WDKSQ VS RTRMAW 
yrviGGRLRLL YEDGDSDDDF WCHMWS PL IHP VG WS RR VGHG I K 
MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK 
IiEAI DPLNLGN I CVAT VCKVLLDG YLM I CVDGGP STDGLDWFCY 
HASSHAI FPATFCQKND1 ELT P PKGYEAQTFNWENYLEKTKS KA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRWHRL 
LS I HFDGWDS E YDQ W VDCES PD I YP VG WCELTG YQLQ P P VAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 
PQGARKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NIKQETDD 


6294 


354 


1814 


AQLTTRGRTVAGGVRWIPSPFPDLELYSCCLGTDRGFPELSHHC 
KNVIATASDYDMAE I TNIRPS FDVS P WAGLIGASVL WCVSVT 
VFVWSCCHQQAEKKHKNPPYKFIHMLKG IS I YPETLSNKKKI IK 
VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ 
LP IKMDYGEELRSP I TSLTPGESKTTSPSS PEEDVMLGSLTFSV 
D YNF PK KALVVTIQE AHGLPVMDDQTQGS DP YI KMTI LPD KR HR 
VKTRVLRKTLDPVFDETFTFYG I P YSQLQDLVLHFLVLS FDRFS 
RDDVIGEVMVPLAGVDPSTGKVQLTRDI I KRNI QKCI S RGELQV 
SLSYQPVAQRMTVWLKARHI^KMDIAGLSGNPYVKVNVYYGRK 
RIAKKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIEFLVIDFD 
RTTKNEWGRLI LGAHSVTASGAEHWREVCESPRKPVAKWHSLS 
EY 


6295 


279S 


617 


VSSALLTGATSGSDAAKS EGAS AS PLSCTNAVAMDRPDEG P PA K 
TRRLSSSES PQRDPPPP PPPPPLLRLPLPPPQQRPRLQEETEAA 
QVLADMRGVGLGPALPPP PPY VI LEEGGI RAYFTLGAECPGWDS 
T I E SG YGEAP P PTES LE ALPTP EASGGSLEI DFQ WQSSS FGGE 

RRRRRRRRKQRKVKRESRERNAERMESILQALEDIQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDI*! iqhi pgfwvkaflnhpr 
IS ILINRRDEDI FR YLTNLQVQDLRH 1 SMGYKMKLYFQTNP Y FT 
NMVI VKEFORNRSGRLVSHSTP IRWHRGOEPO ARRHRNHna cuq 

ffswfsnhslpeadriaeiikndlwvnpiiryylrergsrikrkk 
qemkkrktrgrcewimedapdyyavedifseisdidetihdik 
isdfmettdyfettdneitdinenicdsenpdhnevpnnettdn 
nesaddhettdnnesaddnl^npednnkntddneenpnnnenty 
gnnffkggfwgshgnnqdssdsdneadeasddedndgnegdneg 
SDDDGNEGDNEGSDDDDRDIE yyekviedfdkdqadyedvi E 1 1 
SDESVEEEGIEEG I QQDEDI YEEGN YEEEGSEDVWEEGEDSDDS 
DLEDVL.QVPNGWANPGKRGKTG 


6296 


727 


1199 


RHCGCDAQGACDSLPPTGTSSPVTARNAIPEARCCVWLLDGTTV 
EAVRPARERLARKEIjRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSLQ FPSPFSGT I S FGS FS DS'G I FP LGSQCCLGFQQFS 1 SGK 
KWALIHKRVRLSVFGARWGRIYFGK 


6297 


1 


922 


QRAAAAS PSS CGPRG AE YGALMAMEG YWRFLALLGS ALLVG FLS 
VI FALVWVLHYREGLGWDGSALEFNWHPVLMVTGFVFIQGIAI I 
V YRLPWTWKCSKLLMKS IHAGLNAVAAI LAI ISWAVFENHNVN 
NIANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSLRAFL 
MPIHVYSGI VI FGTVIATALMGLTEKLI FSLRDPAYSTFPPEGV 
FVNTLGLLILVFGALI FWIVTRPQWKRPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSEL1WEVAARKRNLALDEAGQRSTM 


6298 


3 


985 


SVPLRRLSLSGTI^AGTTTKMAVARIAAVAAWVPCRSWGWAAV 
PFGPHRGLS VLLARI PQRAPR WLPACRQ KTSLSFLNR PDLPNLA 
YKKLKGKS PGI I FI PGYLSYMNGTKAIiAI EEFCKSLGHACIRFD 
YSG VGSSDGNS EESTLGKWRKDVLS 1 1 DDLADGPQ I LVGS S LGG 
WLMLHAA I ARPEKVVAL I GVATAADTLVTKFNQLP VELKKE VEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL 
LHGMKDDI VPWHTSMQ VADRVLSTD VDV I LRKHSDHRMREKADI 
QLLVYTI DDLI DKLiS T I VN 


6299 


512 


814 


ECDLEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSAILTSS 
SIDAMDDSAFSGPYKFPFTPPLESFNLCFYTSQVPVPPILGFYQ 
MKEEEVQIjRMWH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPWGAWRQGARA
AQSPFSIPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR
LVIGSLPAHLSPHMFGGFKCPVCSKFVSSDEMDLHLVMCLTKPR
ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE
WFEVNRSCPEHPSD


6301 


616 


284 


GKFVPVNWE PPQPbPFPKYLRCYRCLLETKELGCLLGSDI CLTP 
AGSSCI TLHKKNSSGSDVMVSDCRS KEQMSDCSNTRTS PVSGFW 
IFSQYCFLDFCNDPQNRGLYTP 


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVAIiHSDPCLLS 
PVIiLNCtiPGDIiRPLDEliYAQKLKYKAI SEELDHALNDMTS L 


6303 


2 


1961 


YKNEYGGGLLWQSWQEKHPGQAI^SEPWNFPDTKEEWEQHYSOE - 
YWYYLEQFQYWEAGGWTFDASQSCDTDTYTSKTEADDKNDEKCM 
KVDLVSFLSSPIMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT 
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTNTDP PAEDS QKSSGANTSKDR PHASGTDGDESEEDP PEHK 
PSKLICRSHELD IDENPASDFDDSGSLLGFKYGSGQKYGG I PNFS 
HRQVRYXiEKOTiO^KSKYIJJMRRQIKMKNKHIFFTKESEKPFFKK 
SKILSKVEKFLTOVNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMS VKKGDDLLETNNPEPEKCQSVSSAGELETEN YERDS LLATV 
PDEODCVTQE VPDSRQAETEAEVKKKKNKKKNKKVNGLP PE I AA 
VPELAKYWAQRYRLFSR FDDG IKLDREGWFSVTPEKI AEH I AGR 
VSQS FKCDVWDAFCGVGGNTIQFALTGMR VI AI DID P VK I ALA 
RNNAEVYGIADKIEFICGDFIjLXjASFI.KAD\A^FLSPPWGGPDYA 
TAETFDIRTMMSPDGFEIFRJLSKKITNNIVYFLPRrJADIDQVAS 
LAGPGGQVE IEQNFLNNKLKTITAYFGDLI RRPASET 


6304 


1 


1438 


HRARVDRSRESPGGDLRHPGRVRRDITLSGHPRLSTQHVVLLRE" 

DEVGDPGTFCDLGHPQHGS P IQETQSE VVTLVSPLPGSDMAAIjPA 

WRATSGLTLWPHTAEGRDLLGAENRAI.TGGQQAEDPTI*ASGAYQ 

WPGSVEKLQGSW7CDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 

AQGEVVDKARVPAHGQVLQVGFSTEAAI.QDLSSPRLSQLCSQGL 

CGLI KR PGDLPE VLS FHVDRVLGLRRS h PAVARRFHS PLLPYRY 

TDGGAR PV I WWAPDVQHLSDP DEDQNS LAIX3 WLQYQAIjLAHS CN 

WPGQAPCPGIHHTEWARLALFDFLLQVHDRLDRYCCGFEPEPSD 

PC VEE RLREKCRNP AEIjRLVH I LVRS SDPSHLVYI DNAGNLQH P 

EDKI^FRLLEGIDGFPESAViCVLASGCLONMLLKSLQMDPVFME 

SG^GAQGLKQVbCrriiEQRGQVI.LGHIQKHNLTLFRDEDP 


6305 


99 


420 


WMIWRGRSTYRPRPRRSVPPPEI.IGPMLEPGDEEPQQEEPPTES 
RDPAPGQEREBDQGAAETQVPDLEADIiQELSQSKTGDECGDGPD 
VQGK I LTKS EQFKM P EGR 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEAI>INGDVPMEEATDFSEADKSAIjMD 
ESSDSGVIPGSHSENALHASEEEEGEGGKAQSSLGYIPLMRWQ 
SVRHTTRKSSTTLREGWVVHYSNKDTLRKRHYWRLDCKCITLFQ 
NNTTNRYYKE I PLSE I LTVESAQNFSLVPPGTNTPHCFEI VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQ AS LS I S VSNSQ I QENVDIATVYQ I FPDE VLGSGQF 
GVVYGGKHRKTGRDVAVKVIDKIiRFPTKQESQLRNEVAILQSLR 
HPGIVWLECMFETPEKVFVVMEKLHGDMLEMILSSEKGRLPERL 
TKFLITQILVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFARI IGEKSFRRSWGTPAYLAPEVIXNQGYNRSLDMWSVG 
VIKIYVSLSGTFPFNEDEDINDQIQNAAFMYPASPWSHISAGAID 
LINNLLQVKWRKRYSVDKSLSHPWI^EYQTT'TIjDIjRELEGKMGER 
Y I THESDDAR W EQFAAEHPL»PGSGL PTDRDLGGACPPQDHDMQG 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








LAERISVI* 


6307 


2136 


589 


CFLLPRGRDPE PPE AGAAAPCAPGA PDMS FR K V VRQS KFRH V FG 
QPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFL 
VLPbSKTGRI DKAYPTVCGHTG PVLDIDWCPHNDEVI ASGSEDC 
TVMVWQI PENGLTSPLTEP VWLEGHTKRVG 1 1 AWHPTARNVLI> 
SAGCDNVVI,! WNVGTAEELYRLDSLHPDLI YNVSWNHNGS LFCS 
ACKDKSVRI I DPRRGTLVAERE KAHEGARPMRAI FI±ADGKVFTT 
GFSRMSERQLALWDPENLEEPMALQELDSSNGALLPFYDPDTSV 
VYVCGKGDSS 1RYFE ITEEPPYI HFLNTFTSKE PQRGMGSMPKR 
GLEVS KCE IARFYKLHERKCEP I VMTVPRKSDLFQDDLYPDTAG 
PEAAIiEAEEVn/SGRDADP I LISLREAYVPSKQRDLK1SRRNVLS 
DSRPAMAPGS SH LGAP AS TTrAADATPSGSIiARAGEAG K1»E EVM 
QELRALRALVKEQGDRI CRLEEQLGRMENGDA 


6308 


2 


1118 


GRPTRPEKMLLSLVLHTYSMRYIiLPSWLLGTAPTYVLAWGVWR 
ULSAFIiPARFYQALDDRL YCVYQSM VLFFFEN YTGVQ I LLYGDI/ 
PKNKENI I YLANHQST VDW I VAD I IiAIRQNALGH VR YVLKEGLK 
WLPLYGW YFAQHGG I YVKRS AKFNEKEMRNKLQS Y VDAGTP M Y3j 
VIFPEGTRYNPEQTICVI^SASQAFAAQRGLAVLKHVLTPRIKATH 
VAFDCMKNYLDAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 
KIHIH IDRI DKKDVPEEQEHMRR WLHERFEI KDKMIil EF YE S PD 
PERRKRFPGKSVNSKI*SIKKT1»PSMLIIjSGLTAGM jMTDAGRKL 
YVNTWIYGTLLGCLWVTIKA 


6309 


220 


563 


LVAEVKE PCS LPMLS VDMENKENGS VGVKNSMENGRPPD p ad wa 
VMDWKYFRTVGFEEQASAFQEQEI DGKSLLLMTRNDVLTGLQL 
KLGPAL.KIYEYHVKPU3TKHLKNNSS 


6310 


36


979 


RPRCWKPLiIIiS^iVNCETLRIGKAWPOSSGOERYWTPRTHSSASS 
AQRGSIASLNVAAAGLWADCDQPLYDCPMCGLICTI^YHILQEHV 
DLHLE ENS FQQGMDRVQCSGDLQLiAlIQliQQEEDRKRRS EES RQE 
IE EFQKLQRQYGLDNSGGY KQQQLRNME I EVNRGRM P PS EFHRR 
KADMMESIiAIiGFDDGKTKTSGI I EALHRYYQNAATDVRRVWIjSS 
VVDHFHSS IXjDKGWGCGYRNFQMLIjSSLLQNDAYNDCIjKGML I P 
CIPKIQSMIEDAWKEGFDPQGASQLIIRLQGTKAWIGACEVYII, 
LTSLRV 


6311 


1


675 


P\WWNSCEGPRIJUUUyR.TGHGVGRRARLACLGEPRVKAAVKIiTJb 
ASKLKRDDGIiKGSRTAATASDSTRRVSVRDKLLVKBVAEIiEANIj 
PCTCKVHF PDPNKLHCFQLTVTPDEG YYOGG KFQFETEVPDAYW 
MVPPKVKCLTKIWHPNITETGEICLSIjLREHSIDGTGWAPTRTL 
KDVVWGLNSLFTDIJL^FDDPIiNIEAABHHLRDKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


gdelvkreagmkmlpgvgvfgtgssarvlvpllraegftvealw 
gkteeeakqlaeemkiafytsrtddillhqdvdlvcis I PPPLT 
rqisvkalgigknwcekaatsvdafrmvtasryypqlmslvgn 
vlrflpafvrmkqli sehyvgavmi cdari ysgsllsps ygwi c 
deu4gggglhtmgtyxvdllthltgrraekvhgllktfvrqnaa 
irgirhvtsddfcffqmlmgggvcstvtlnfkmpgafvhevmvv 
gsagrlvargadlygqknsatqeeiillrdslavgaglpeqgpqd 
vpllylkgmvymvqalrqsfqgqgdrrtwdrtpvsmaasfedgl 
ymqswdai krs srsge weavevltee pdtnqnlcealqrnni* 


6313 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK
YI QSTGSSDDSAIiAI*I*AD I TSKYRQGDRKGQI KEDGCPS D PTS K 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLPYHFKEHMKSHSTE 



WO 01/53312 
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVTNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


qrsgaarlaflpspfspacvhrsplsfhgcwfyfvwfmplgvl 
fhrrrahgctlscssfveqptameaeetmeclqefpehhkmild
rlneqreqdrftditlivdghhfkahkavlaacskffykffqef 

TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE

flqmleaikalevrnkensapleenttgkneakkrkiaetsnvi 
teslpsaesepveieveiaegtievedegietleevasakqsvk

YIQSTGSSDDSAIAIiLADITSKYRQGDRKGQIKEDGCPSDPTSK 

Qvegieivelqlshvkdlfhcekcnrsfklfyhfkehmkshste 
sfkceicnkrylresawkqhlncyhleeggvskkqrtgkkihvc 
qycekqfdhfghfkehlrkhtgekpfecpncherfarnstlkch 
ltacqtgvgakkgrkklyecqvcnsvfnswdqfkdhlvihtgdk

PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ

TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


I^riAVNVVTTLVLISYCPTATEEAPYWTYLLCALGLFIYQSLDA 
IDGKOARRTNSCSPI/3PT.Pnwnn^OT cn/PMnirr' t\ c> t ?\ n t->t /^mv 

PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIALVI 
VFVLSAFGGATMWDYTI PI LEI KLKI LPVLGFLGGVI FSCSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGLIIILAIMIYKKSATD 
VFEKHPCLYILMFGCVFAKVSQKLWAHMTKSELYLQDTVFLGP 
GLLFLDQYFNNF IDEYWLWMAMVISS FDMVI YFSALCLQISRH 
LHLNI FKTACHQAPEQVQVLSSKSHQNNMD 


6316 


1503 


792 


VSAGAGTG IMGGTTSTRRVTFEADENEN I TWKG I R LS ENVI DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRAILRERICSEEERAKAKHL 
ARQLEEKDRVLKKQDAF YKE QLARLEERSS E FYRVTTEQ YQKAA 
EEVEAKFKRYESHPVCADLQAKILOCYRENTHQTLKCSALATQY 
MHCVKHAKQSMLEKGG 


6317 


102 


839 


PKAQTS AV LARE KGHLP TMRH EAPMQMAS AQDAR YG Q KDS S DQN 
FDYMFKLLI IGNSSVGKTS FLFRYADDS FTS AFVSTVGI DFKVK 
TVFKNEKR I KLQ I WDTAGQERYRTITTAY YRGAMGF I LM YDI TN 
EESFNAVQD WS TQ I KT YSWDNAQVILVGNKCDMEDERVI STERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESIiETD 
PA I TAAKQNTR LKETP PP PQ PN CAC 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLELRRBAPPL 
I^PIiSPFPLPAGStfl^QMIJiSSlJlFPITNSAGAPCKAAGRDINI 
LAP VRRDR VLAELP Q CXiRKEAALHGHKDFHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLG I PFSLQLWDTAGQERFKCI ASTY YRGAQAI I IVFNLN 
D VASLEHT KQWLADALKENDPS S VLLFLVGS KKDLSTPAQYALM 
EKDAIX>VAQEMKAEYWAVSSLTGENVREFFFRVAALTFEANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLGKKWLVPYTSEHVPSRYHEWMKSEELQRLT 
ASEPLTLEQEYAMQCSWQEDADKCTFIVLDAEKWQAQPGATEES 
CMVGDVNLFLTDLEDLTTGEIEVMIAEPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 



TLRLTVSESEHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAMAAVDS FYLLYRE I ARSCNCYMEAIALVGAWYTA 
RKSI TVI CDFYSLI RI^FIPRLGSRADLIXQYGRWAVVSGATOG 
IGKAYAEELASRGLNI ILISRNEEKLQWAKDI ADTYKVETDI I 
VADFSSGRE I YLP I REALKDKDVG I LVNNVGVFYP YPQ YFTQLS 
BDKLWDI INVNIAAASLMVHWLPGMVERKKGAIVTISSGSCCK 
PTPQ LAAFS ASKAYIjDHFS RALQ YE YASKG 1 FVQSL I P F YVATS 
MTAPSNFLHRCS WLVPSPKVYAHHAVSTLG IS KRTTGYWSHS IQ 
FLFAQYMPEWLWVWGANILNRSLRKEAIjSCTA 


6321 


1418


341 


HRKAALGALMAGRLIX5KALAAVSLSLALASVTIRSSRCRGIQAF 
RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVG WLVE WQD YKP VE YTAVS VLAG PRWADPQ I S ESNFS PK 
FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP 
NHAADPI I TR WKRDS SGNKI MHP VSGKH1 LQFVAI KRKDCGEWA 
I PGGMVDPGEKI S ATLKRE FGEEALNS LQKTS AEKRE I EE KLHK 
IiFS QDHLV I Y KG YVDDPRNTDNAWMETEAVN YHDETGE I MDNLM 
LEAGDDAGKVKWVDINDKLKLYASHSQFIKLVAEKRDAHWSEDS 
EADCHAL 


6322 


2047 


1083 


NQE I LKNVES SRTVQPH FLEFLLS LGWSVDVGRH PGWTGHVSTS 
WSINCCDDGEGSQQEEVI SSEDIGAS I FNGQKKVL YYADALTE I 
AFWPSPVESLTDSLESNISDQDSDSNMDIjMPGIIiKQPSIiTLEIj 
FPNHTDNItNSSQRLS PSSRMRKLPQGRPVP PLGPETRVSWWVE 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTLEKEVPVIF 
IHPLNTGIjFR I KIQGATGKFNMVI PLVDGMI VSRRALGFLVRQT 
VINI CRRKRLESDSYS PPHVRRKQKITDIVNKYRNKQLEPEFYT 
SliFQEVGLKNCS S 


6323 


1 


656 


PASTTDGAQE ARVPLDGAFW I PRP PAGS PKGCFACVSKPPALQA 
PAAPAPEPS AS PPMAPTLFPMESKSS KTDSVRAAGAPPACKHIA 
EKKTMTNPTTVIEVYPDTTEVNDYYLWSIFNFVYliNFCCIiGFIA 
LAY S LKVRDKKLLND LNG AVE DAKTDRL I N I TRS GLAAS C I ML W 
MALS VIATHRGLRSSAS ILVAEPHDWNTERPQVTFRERCPAL 


6324 


X 


2061 


EGAGMRRCP CRGSIjNEAE AGALPAAARMGIjEAPRGGRRRQPGQQ 

rpgpgagapagrpegggpwartegsslhseperaglgpapgtes 
pqaefwtdgqtepaaaglgveterpkqktepdrsslrthlewsw 

SELGTTCIiWTETGTDGLWTDPHRSDIiQFQPEEASPWTQPGVHGP 
WTELETHGSQTQPERVKSWADNLWTHQNSSSLQTHPEGACPSKE 
PSADGSWKEIiYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLI I TPETPEPEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYS PFWS FRKH YP WVQLSGHAGNFQAGEDGR I LKRFCQC 
EQRSLEQLMKDPLRPFVPAYYGMVLQDGQTFWQMEDLLADFEGP 
SIMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 
EEHAQGAVTKPRYMQWRETMSSTSTLG FR I EG I KKADGTCNTNF 
KKTQALEQVTKVLEDFVDGDHVILQKYVACLEELREALEISPFF 
KTHEWGSSLLFVHDH1GLAKVWMIDFGKTVALPDHQTLSHRLP 
WAEGNREDG YLWGLDNM I CLLQGLAQS 


6325


165 


944 


GI*RDP FRRKRRLKPQVKMSN YVNDMWPGSPQEKDS PSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRR Y SRS YSRS RSRSRSRRYRERR YG FTRRYYRS PSRYRSRS 
RSRS RS RGRS YCGRAYAI ARGQR YYGFGRW YPEEHSRWRDRSR 
TRSRSRTPFRLSEKDRMELLEIAKTNAAKALGTTNIDLPASLRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 


GEPSPATQQKPSATGAGVLHQHFSSGHIYVL.MGLLPPPWTISFT 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDIARRMVPRAGKRTQTL 
G ARRVAAQGAR P L P EDRRP KS GERLHVTVAP CWE FVLP S VS LTA 
QAWGGVGQEASSGVP 


6327 


1 


1337 


SLARIiAPAGGS WMPTQQPAAPSTRAPKPSRSLSGS LCALFS DA 
DSGSGMKAELPPGPGAVGRBMTKEEKLQLRKEKKQQKKKRKEEK 
GAEPETGS AVSAAQCQGPTRELPESG IQLGTPREKVPAGRS KAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLUjRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RQNSLTQFMS IPS S V IHPAMVRIjGJjQYSQGLVRGSNARCI AlrLR 
ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFLTQCRPt^ASMHN 
AI KFLNKE I TS VGS SKREEEAKSELRAAI DRYVQE KI VLAAQAI 
SRFAYQKI SNGDVI L VYGCSS LVSRI LQEAWTEGRRFRWWDS 
R PWLEGRHTLRSL VHAGVPAS YLLIPAAS YVLPEVSTEEKDSKV 
GGEKV 


6328 


1030 


276 


HASAE VTTAAARGLGAMEBE MHTDAK I RAENGTGS S PRG PGCS I» 
RHFACEQNLLSRPDGSASFIiQGDTSVLAGVYGPAEVKVSKE I FN 
KATLEVI LRPKIGLPGVAEKSRERLIRNTCEAVVLGTLH PRTS I 
TWU^WSDAGSLLACCLNAACMALVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329


3 


2016 


S S EVAAGGGTRS AMAEGSGE VVT VSATGAANGLNNGAGGTSATT 

SNPLSRKI^KILETRLDNDKEMLEALKALSTFFVENSIiRTRRNIi 

RGDIERKS LAINEEFVS I FKEVKEELES I S EDVQAMSNCCQDMT 

SRLOAAKEQTQDL»IVKTTKLQSESQKLiE I RAQVADAFLS KFQI/T 

S D EMS LLRGTREG P I TEDFFKALGRVKQ I HNDVKVLLRTNQQTA 

GLEIMEQMALLQETAYERLYRWAQSECRTLTQES CDVS PVLTQA 

MEALQDRPVLYKYTLDEFGTARRSTVVRGFIDALTRGGPGGTPR 

P I EMHSHDPLR YVGDMLAWLHQAT ASE KE H LE AIjLKHVTTOG VE 

ENIQEWGHITEGVCRPLKVRIEQVIVAEPGAVnLYKISNLIiKF 

YHHTISGI VGNS ATALLTTI EEMHLLSKKI FFNSLSLHASKLMD 

KVELPPPDLGPSSALNQTLMLLREVXiASHDSSWPLDARQADFV 

QVLSCVLD PIjIjQMCTV S AS1MLGTADMATFM VNS DYMMKTTLALF 

EFTDRRLEMLQFQIEAHLDTLINEQASYVLTRVGI^YIYNTVQQ 

HKPEQGSIJU^P^DSVTLKAAMVQ^RYLSA^ 

LS ATVKEQ 1 VKQSTELVCRAYGEVYAAVMNP I NEYKDPEN I LHR 

SPQQVQTLLS 


6330 


1151 


333 


FFYYTFYENKTFSRKMVAEKETLSLNKCPDKMPKRTKLIiAQQPI* 
PVHQPHSLVSEGFTVKAMMKMSWRGPPAAGAFKERPTKPTAFR 
KFYERGDFP I ALEHDS KGNKIAWKVEI EKLD YHHYTjPIj F FDGI>C 
EMTFP YE FFARQGIHDMLEHGGN KI LP VLPQLI I PI KNALNLRN 
RQVICVTLKVLQHLWSAEMVGKALVPYYRQ I LPVLNI FKNMNV 
NSGDGIDYSQQKRENIGDLIQETLEAFERYGGENAFINIKYWP 
TYESCLLN 


6331 


3 


495 


QQGQR VRTRGRRACAS AT PLEGCVDLS YPRTHAALLKVAQMVTL 
LIAFI CVRS S LWTNYS AYS Y FEWTI CDLIM I LAF YLVHLFRFY 
RVLTCI S WPLS E LLH YL I GTLLLLIAS I VAAS KS YNQ SG L VAGA 
I FG FMATFLCMASI WLS YKI SCVTQSTDAAV 


6332 


1 


878 


.VTESNKFDLVSFI PLLRERI YSNNQYARQFI ISWILVLESVPDI 
NLLDYLPEILDGLFQILGDNGKEIRKMCEWLGEFLKEIKKNPS 
SVKFAEMANILVIHCQTTDDLIQLTAMCWMREFIQLAGRVMLPY 
S SG I LTAVLP CLAYDDR KKS I KEVANVCNQS LM KLVTPEDDE LD 
BLRPGQRQAEPTPDDALPKQEGTASGEWTPSIjHIjTSCRGPREPD 
VIGVALGPHLSNQDYFMYVTHTIVAATQRSGSSGS PP FCRQDTG 
KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLIiAPLiPPRSAG 
QPLTFS PSGRQPLRSIiLVGMCSGSGRRRSSLS PTMRPGTGAERG 
GLWMGHPGMHYAPMGMHPMGQRAKMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHt^SQASMQPALPPGVNSMDVAAGTAS 
GAKSMWTEHKS PDGRTY Y YNTETKQSTWEKPDDLKTPAEQLLS K 
CPWK2YKSDSGKPYYYNSQTKESRWAKPKELEDUEGYQNTIVAG 
SLITXSNLHAM I KAEES S KQEECTTTSTAPVPTTEI PTTMSTMA 
AAEAAAAVVAAAAAAAAAAAAANAKASTSASNTVSGTVPVVPEP 
E VTS I VATWDN ENTVT I STEEQAQLTST PAI QDQSVEVS SNTG 
E ETS KQETVADFTPKKEEEESQ P AKKTYT WNTKEEAKQAF KEI*L 
KEKRVPSNAS WEQAMKMI INDPRYSAIiAKLSEKKQAFNAYKVQT 
EKK 


6334 


17 


644 


GGNPS GRAAGFAAAAM PS S PLRVAWCSSNQNRSMEAHN 1 liSKR 
GFSVRSPGTGTHVKLPGPAPDKPNVYDFKTTYDQMYNDLLRKDK 
EliYTQNGI I.HMLDRNKRI KPRPERFQNCKDLFDLILTCEERVYD 
QWED LNS REQETCQP VHWNVDI QDNHEEATLGAFL I CELCQC 
IQHTEDMENBIDEL.LQEFEEKSGRTFLHTVCFY 


6335


82 


529 


AARAR PG VLCCRLLiGAAIiGDQSRVEMS Y I PGQ P VTAWQRVE IH 
KLRQGENLILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
GPAE I AGLQ IGDKI MQVNGWDMTM VTHDQARKRLTKRS EEWRL 
LVTRQSLQKAVQQSMLS 


6336 


1003


438 


HE PAS KGRAEVGNMRLSVAAAISHGRVFRRMGLGPESRIHLLRN 
IJ^TGbVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM 
AD FWLTEKDL I PKLFQVLAPRYKDQTGGYTRMLQ I PNRS LDRAK 
MAV I E YKGNCLP PIjPLPRRDSHLTIjIiNQLIjOGLRQDIiRQ SQEAS 
NHSSHTAQTPGI 


6337 


76 


524 


EG I QML.S VQPDTKPKGCAGCNRKI KDRYLLKALDKYWHEDCLKC 
ACCDCRLGEVGSTbYTKAKIiILCRRDYLRLFGVTGNCAACSKLI 
PA FEMVMRAKDNVYHLDCFACQLCNQR FCVGDKF FLKNNMI LCQ 
TDYREGLMKEGYAPQVR 


6338


66 


1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP 
GLRIJ^LLLLGIiGTPKSGVC^QEGLDFPEYDGVDRVINVNAKNY 
KNWKKYEVLALL YHEPPEDDKASQRQFEMEEL I IiELAAQVLED 
KGVGFGLVDSEKDAAVAKKLGLTEVDSMYVFKGDEVIEYDGEFS 
ADT IVE FLLDVLEDPVELI EGERELQAFENI EDE I KLIGYFKS K 
DSEHY KAFEDAAEEFHPYI PFFATFDS KGAKKLTLKLNE I DFYE 
AFMEE PVTI PDKPNSEEE XVWFVEEHRRSTLRKLKPESMYETWE 
DDMDGIHI VAFAEEADPDGFEFLETLKAVAQDNTENPDLS 1 1 Wl 
DPDD F P LLVP YWE KTFDI DLS APQIG WNVTDADRIiWMEMDDEE 
DLPSAEELEDWLEDVLEGE INTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSQGAMKAFH 
TFCWLLVFGSVSEAKFDDFEDEEDIVEYDDNDFAEFEDVMEDS 
VTES PQRVI ITEDDEDETTVELEGQDENQEGDFEDADTQEGDTE 
SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLQNSWBSYY 
LE I LM VTG L LAY I MN Y I IGKNKNS RLAQAW FNTHRELLESNFTXi 
VGDDGTNXEATSTG KLNQENEHI YNLWCSGRVCCEGMLI QLRFI* 
KRQDLLNVLARMMRPVSDQVQIKVTMbTOEDMOTYVFAVGTRKAL 
VRLOKE^DIiSEFCSDKPKSGAKYGIjPDSIiAlLSEMGEVTDGMM 
DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPliKLPDTKR 
TLLLTFNVPGSGNTYPKDMEAiLPL^MVIYSIDKAKKFRLNRE 
GKQKADKNRARVEENFLKLTHVQRQEAAQS RREE KKRAEKER I M 
WEEDPEKQRRLEEAAI.RREQKKI1EKKQMKMXQIKVKAM 


6340 


2 


583 


EACAHTLSCPAFARDGRARRRPWMSHRTSSTFRAERSFHSSSSS
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
PARPGGAGNI KTLGDAYEFAVDVRDFS PEDI IVTTSNNHI EVRA 
EKLAADGTVMNNFAHKCQLPEDVDPTSVTSALREDGSLTIRARR 
HPHTEHVQQTFRTE I K I 


6341 


2 


645


KMAVLSAPGLRGFR I LGLRSS VGPAVQARGVHQSVATDGPSSTQ 
PALPKARAVA PKPS SRGE YWAKLDDIjVNWARRSS LW PMTFGLA 
CCAVEMMHMAAPRYDMDRFGWFRASPRQSDVMIVAGTIiTNKMA 
PALRKVYDQMPBPRYWSMGSCANGGGYyHySYSWRGCbRIVP 
VD I YI PG C P PTAEAIjI* YG I LQLQRXI KRERRLQIW YRR 


6342 


2 


1191 


DPRVRAMLATLARVAALRKTCLFSGRGGGRGLWTGRPQSDMNNI 
KPLEGVKILDLTRVLAGPFATMNLGDLGAEVIKVERPGAGDDTR 
TWGP PFVGTESTYYLS VNRNKKS IAVN1 KDPKGVKI I KELAAVC 
DVFVENYVPGKLSAMGLGYEDIDEIAPHIIYCSITGYGQTGPIS 
ORAGYDAVASAVSGLMHITGPEVACIiSHlAANYLIGQKEAKRWG 
TAHGSIVPYQAFKTKDGYIWGAGNNQQFATVCKILDLPELIDN 
SKYKTNHLRVHNRKEblKILSERFEEELTSKWLYLFEGSGVPYG 
P I NNMKNVFAE PQVLHNGLVMEMEHPTVGKI S VPGPAVR YS KF K 
MSEARP F PLLGQHTTHI LKEVLRYDDRAIGELLSAGWDQHETH 


6343 


2 


936 


GTAMVSDEDELNLLVI WDANPI WWGKQALKESQFTI^S KC IDAV 
MVLGNSHLFMrTRSNKLAVIASHIQESRFLYPGKNGRLGDFFGDP 
GNPPEFNPSGSKDGKYEIiLTSANEVIVEEIKDLMTKSDIKGQHT 
ETLLAGSLAKALC Y1HRMNKE VKDNQEMKS RI L V I KAAEDSALQ 
YMNFMNVI FAAQKQNI L IDACVLDSDSGLLQQACDITGGIi YLKV 
PQMPS LLQ YLLW VFTjPDQDQRSQIi I L P PPVHVD YRAACFCHRNL 

IEIGYVCSVCXSIFCNFSPICTTCErAFKlSLPPVLKAKKJCKLK 
VSA 


6344 
6345


2508 


147 


TM PTATU3NIjRGYGMAS pglaap sltp pqlat pnlqqffpqatr 
QS LLG PPPVG VPMN PSQ FNI>SGRN PQKQARTS S STTPNRKDS SS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRLRSSEEPTEKEPPGQLQVKAOPOARWT 
VPKQTQTPDLLPE ALEAQVLPRFQPRVLQVQAQVQS QTQPR I PS 
TDTQVQ PKLQ KQAQTQTS PEHLVLQQKQVQPQLQQSAE PQKQVQ 
PQVQPQAHSQGPRQVQLCK2EAEPLKQVQPQVQPQAHSQPPRQVQ 
IiQLQKQVQTOTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQPQVSLLAPEQTPWVHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVSMEEIQNESACX3LDVGECENRAREMPGVWGAGGSLKVTI 
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY 
I CKAS CS S QQEFQDHMSE PQHQQRIjGE I QHMSQACIiIjSLLP VPR 
DVLETEDEE PPPRRWCNTCQL Y YMGDLI QHRRTQDHKI AKQS LiR 
PFCTVCNRYFKTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFI TVDAVGCFEGDEEEEEDDEDEEE I EVEEBLCKQVRSRDIS 
REE WKGS ETYS PNTAYG VDFLVP VMG YICRICH KFYHSNSGAQL 
S HCKS LGHFENLQKYKAAKNPS PTTRP VSRRCA INARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 




2 


3483 


PRVRTKL I bljVNDKKR YERVGGGPKRLGRD VEMEEM I EQLQE KV 
HELEKQNDTr^RLISAKQQLQTQGYRQTPYNNVQSRINTGRRK 
ANENAGLQECPRKGIKFQDADVAETPHPMFTKYGNSIjLEEARGE 
IRNLENVIQSQRGQI EEIiEHLAE I If KTQliRRKENE IELSLLQLR 
EQQATDQR SNIRDNVEMI KLHKQLVEKSNALSAMEGKF IQLQEK 
QRTLKI SHDALMANGDELNMQIjKEQRLKCCS LEKQLHSMKFS ER 
RIEELQDRINDLEKERELLKENYDKLYDSAFSAAHEEQWKIiKEQ 
QLKVQIAQLETALKSDLTDKTEILDRLKTERDQNEKLVQENREI> 
QLQYLEQKQQLDEIiKKR I KLYNQENDINADEL»SEAIiIiL>I KAQKE 
QKNGDLS FLVKVDS E INKDLERSMRELQATHAET VQELEKTRNM 
LIMQHKINKDYC^BVEAVTRKMENLQODYELKVEQYVHLLDIRA 
ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 
ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 
VRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITLEVHQAYSTEYE 

tiaacqlkfheileksgrifctasligtkgdipnfgtveywfri, 

RVPMDQAI RL YRERAKALG YI TSNFKG PEHMQS LSQQ AP KTAQL 
SSTDSTDGNl^IJllTIRCt^LQSRASHIiQPHPYVVYXFFDFA 
DHDTAI I PSSNDPQFDDHMYFPVPMNMDLDR YLKSESLS FYVFD 
DSDTQENI YIGKVNVPL I S LAHDR C I SG I FEI/TDHQKHPAGTIH 
VILKWKFAYLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 
LVIAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQEGSVDEVKEN 
TEKMQQGKDDVSLLSEGQLAEQSLAS SEDETEI TEDLEPEVEED 
MSASDSDDCI I PGPISKNIKQPSEKI RIEI IALSLNDSQVTMDD 
T I QRL F VECRFYS LPAE ETP VSL P KP KSGQ. WVY YNYSNVI YVDK 
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDI>ECEDI 
GVAHVDLADMFQEGRDL I EQNIDVFDARADGEGIGKLRVTVEAL 
HALQ S VYKQ YR DDLEA 


6346 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV 
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHIAYL 
IADQGQLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKN I KQTELVADLREAILRVARHFQCTDPKNCS WSRQLPGLI, 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKW YQPWS PLRS PGWVQI KC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW
DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL
IADQGQLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKNI KQTELVADLREAILRVARHFQCTDPKNCS WSRQLPGLI. 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE 

klaeqlkqaeelrqykvlvhsqereltqlreklregrdasrsln 
ehlqalltpdepdksqgqdlqeqlaegcrlaqhlvqklspendn 
dddedvqvevaekvqkssspremqkaeekevpedsleecaitcs
nshgpcdsnqphknikitfeedevnstlwdresshdecqdaln
ilpvpgptssatnvsmvvsagplsgekaainileineklrpqla

EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN

erqfkeeklaeqlkqaeelrqykvlvhsqereltqlreklregr 
dasrslnehlqalltpdepdksqgqdlqeqlaegcrlaqhlvqk
lspendndddedvqvevaekvqkssaprempkaeekevpedsle 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE


QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGIALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6349 
6350


3 



3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMVVSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE

QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGIALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ




3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMVVSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE

QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGIALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6351 


1291


319 


REARRRTERSQLGRMLWEVANGRSLVWGAEAVQAIjRERIjGVGG 
k i vvj/Uj±^u*jFKyWtoKlXjJjPij]jJjiyiPEEARIj 

DSRHHSLAIjTSFIC^O^EESFQEQSALAAEARETRRQELLEKITE 

gqaakkqkleqasgasssqeagssqaakedetsdgqasgeqeea 
gpsssqagpsngvaplprsaliivqiatarprpvkarpldwrvqs 
kdwphagrpahelrysiyrdlwergfflsaagkfggdfl.vypgd 

PLR FHAH Y 1 AQCWAPEDTI PLQDLVAAGRLGTS VRKTLLLCS PQ 
PDGKWYTSLQWASLQ 


6352 


235 


923 


WSEWLSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP 
AMARS bVHDT VF YCIiS VYQVKIS PTPQLGAAS S AEGHVGQGAPG 
LMGNMNPEGGVNHENGMNRDGKM T PKGGGctwof PR on pn pbdrp 

PAQAAMEGPQPENMQPRTRRTKFTLbQVEELESVFRHTQYPDVP 
TRRELAENLGVTEDKVRVWFKNKRARCRRHQRELMLANE1JIADP 
DDCVYIWD 


6353 


65 


672 


RFAGAGAI PEARARPPDVQAAEEEKEMDLPDSASRVFCGRI I,SM 
VNTDDVNAIIl^QKNMLDRFBKTNEMLLNFNNLSSARLQQMSER 
FLHHTRTLVEMKRDLDS I FRRI RTLKGKIiARQHPEAFSH I PEAS 
FLEEEDEDPIPPSTTTTIATSEQSTGSCDTSPDTVSPSLSPGFE 
DLSHVQPGSPAIWGRSQTDDEEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNQEVFC 
HPVTDQSLIVFXLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 
S VQPLS LENLALRGRCQEAWVLSGKQQI AKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPP 


6355


158 


1662 


RGSSAAFRGSGLRGAMIRRVLPHGMGRGLLTRRPGTRRGGFSbD
WDGKVSEIKKKIKSILPGRSCDIjLQDTSHLPPEHSDWIVGGGV 
IjGI^VAYI^LKKLESRRGAIRVTiVVERDHTYSQASTGLSVGGICQ 
QFSLPENIQIiSLFSASFLRNINEYLAVVDAPPLDLRFNPSGYLL 

ij^ekdaaamesnvkvqrqegakvslmspdqlrnkfpwintegv 
ai^asygmedegwfdpwcllqglrrkvqslgvlfcqgevtrfvss 
sqrmlttddkavvlkrihevhvkmdrsleyqpvecaivinaaga 
WS AQIAALAGVGEGPPGTLQGTKLPVE prkr yvyvwhcpqgpgl 
etplvadtsgayfrreglgsnylggrs pteqeepdpanlevdhd 

FFQDKVWPHIiALRVPAFETLKVQSAWAGYYDYNTFDQNGWGPH 
PIj WNM Y FATGFSGHGLQQAPGIGRAVAEM VLKGRFQTI DLS P F 
LFTRFYLGEKIQENNI I 


6356 


354 


633 


TGLTSSCLPLQVMMTKRTKDMGKFSS VTVSTIDEEEBE I BARE V
ADS YAQNAKV I EKQLERKGMS KRRLQELABLEAKKAKMKGTLID 
NQFK 


6357 


2 


915 


GLLRNMALLVRVLRNQTS ISQWVPVCSRLI PVSPTQGQGDRALS 
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
QPVEEKVGAFTKIIEAMGFTGPLKYSKWKIKIAALRMYTSCVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLVRMKQEGRSGKYM 
CRI I VHFMWEDVQORGRVMGVNPY ILKKNMILMTNHFYAAILGY 
DEGILSDDHGLAAALWRTFFNRKCEDPRHLELLVBYVRKQIQYL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1040 


AS DALHSLS A P VLRLS SRS AAR PATMTEQAI S FAKD FLAGG I AA 
AISKTAVAPIBRVKLLLQVQHASKQIAADKQYKGIVDCI VRI PK 
EQGVLSFWRGNLANVIRYFPTQAIiNFAFKDKYKQI FliGGVDKHT 
Q FWR YFAGNIiAS GGAAGATSLCFVY PLDFARTRLAADVG KSGTE 
REFRGLGDCLVKI TKSDGIRGLYQGFSVSVQGI 1 1 YRAAYFGVY 
DTAKGMLPDPKNTHI WS WM IAQTVTAVAG VVS YPFDTVRRRMM 

MQSGRKGADIMYTGTVDCWRKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKVI 


6359 


98 


1086 


VCRQEEEKMKEDCLPSSHVPISDSKSIQKSELLGIiLKTYNCYHE 
GKSFQLRHREEEGTLIIEGLLNIAWGLRRPIRU2MQDDREQVHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQRIRRHRFS 
INGHFYNHKTSVFTPAYGSVTNVRVNSTMTTLQVLTLLLNKFRV 
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCEKIARIF 
LMEADLGVEVPHEVAQYI KFEMPVLDSFVEKLKEEEEREI IKLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEE WL»PPRS CRVFW IHSGTTMS KVS FKITLTSDP
RLPYKVLSVPESTPFTAVJJKFAAEEFKVPAATSAI ITNDG IGIN 
PAQTAGNVFLKHGSELRI I PRDRVGSC . 


6361 


615 


158 


RPGLGQLQHCAI.APQAGNRRCRFHGRLHALTRSTHRGKPMS IMQ 
FKDTLNTPLPDS S P VAVPLGAP I AVASTLS VEHNDGVETG I WAC 
APGRWRRQITSQEFCHFIQGRCTFTPDDGETLHIOAGDALMLPA 
NSTG I WD I QET VRKTYVL 1 1> 


6362 


350 


1576 


1-lTaiXiSHSAAI.KliQQLPPTSSSSAVSEASFSYKENLIGALIAIF 
GHLWS IALNLQKYCHI RIiAGSKDPRAYFKTKTWWLGLFLMLLG 
ELGVFASYAFAPLSLIVPLSAVSVIASAIIGIIFIXEKWKPKDF 
LRRYVLS FVGCGLAWGTY LLVT FAPNS HEKMTG ENVTRHLVS » 
PFLLYMLVEI ILFCLLLYFYKEKNANN I WILLLVALLGSMTW 
TVKAVAGMLVLS I QGNLQLD YP I F YVMFVCMVATAVYQAAFLSQ 
ASQMYDSSLIASVGYILSTTIAITAGAIFYLDFIGEDVLHICMF 
ALGCLIAFLGVFL I TRNRKK? I PFEP YISMDAMPGMQNMHDKGM 
TVQPEWCASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVXEHTKKE 


6363 


21 


1201 


RRTRLGSS FPRRRDS S AMES YDVIANQP WIDNGSGVI KAGFAG 
DQI PKYCFPNYVGRPKHVRVMAGALEGDI FIGPKAEEHRGLLS I 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPI, 
NPRKNRERAAEVFFETFNVPAIjFT ^MDivVT.<;T.va , rr:»'T*Tr^r^7T n 

SGDGVTHAVPI YEGFAMPHS I MRI D I AGRDVSRFLRLYLR JCEG Y 
DFHSSSEFEIVKAI KERACYLS INPQKDETLETEKAQYYLPDGS 
TIEIGPSR FRAPELL FRPDL I G EESEG IHE VL VFAI QKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQB 
RliYSTWIGGSILASLDTFKKMWVSKKEYEEDGARS IHRKTF 


6364


21 


1201 


RRTRLGS S F PRRRDS SAM E S YDVI ANQ PWI DNGSGVI KAG FAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPR KNRERAAEVFFETFNVPALFI SMQAVLS LYATGRTTGWLD 
SGDGVTHAVPI YEG FAMPHS IMRIDIAGRDVSRFLRLYLRKEGY 
DFHSSSEFE I VKAI KERACYLS INPQKDETLETEKAQYYLPDGS 
TI3IG PS R FRAPELL FRPDL IGEESEG I HE VLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLI»SEVKKIAPKDVKIRISAPQE 
RLYSTWIGGS ILASLDTFKKMWVS KKE YEEDGARS IHRKTF 


6365 


234 


1939 


KHKSRASCAARAQAFGPSREREVHSRFRSGLRRLGESCJSGCCTM 
ASMGTLAFDEYGRPFLIIKDQDRKSRLMGLEAIjKSHIKAAKAVA 
NTMRTS LG PNGLDKKMVDKD3DVT VTNDGAT I LS MMDVDHQ I AK 
LMVELSKSQDDE I GDGTTGVWIiAGAIiLEEAEQIjLDRG I HP I RI 
ADGYEQAARVAI EHLDKI S DSVLVD I KDTE PL I QTAKTTLGSKV 
WSCHRC^AEIAVNAVLTVADMERRDVDFELIKVEGKVGGRbED 
TKI.IKGVIVDKDFSHPQMPKKVEDAKIAILTCPFEPPKPKTKHK 
LDVTS VEDYKALQKYKKEKFEEM I QQI KETGANLAICQWGFDDE 
ANHLIiLQNNIjPAVRWVGG PE I EI»I A I ATGGRI VPRFSEI/TAEKL 
GFAGLVQE I S FGTTKDKMLVIEQC KNS RAVT I F I RGGNKM I IEE 
AKRS LHDALCVIRNLI RDNR WYGGGAAE IS CALAVSQEADKCP 
TLEQYAMRAFADALEVIPMAIjSENSGMNPIQTMTEVRARQVKEM 
NPAI.GIDCLHKGTNDMKQQHVIETLIGKKQQ I SIiATQMVRM I IjK 
IDDIRKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFWVLDS 1 FLGA VAMLCKEQGI TVLGLNAVFD II*V 
IGKFNVLE I VQKVLHKD KSLENLGMLRNGGLLFRMTLLTSGGAG 
MLYVRWRIMGTGPPAFTEVDNPAS FADSMLVRAVNYN YY YS LNA 
WLI*LCPWWLCFDWSMGCIPLIKSISDWRVIALAALWFCLIGI»IC 
QALCSEDGHKRR I LTLGLGFLVI P FIjPASNLFFRVGFWAERVL 
YLPS VG YC VLI/TFG FGAIiS KHTKKKKL I AAWIiG I LFINTLRCV 
LRSGEWRSEEQLFRSALSVCPIjNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRIiNPKYVHAMNNLGNIIiKERJJELQEAEELLSLAVQIQ 
PDFAAAWMOT-GIVQNSIjKRFEAAEQSYRTAIKHRRKYPDCYYNL 
GRLYADLNRHVDALNAWRNATVLKPEHSLAWNNMI 1 LLDNTGNI* 
AQAEAVGREALEL 1 PNDHSU4FSLANVLGKSQKYKESEALFLKA 
IKANPNAASYHGN1AVLYHRWGHLDLAKKHYEISLQLDPTASGT 
KENYGLLRRKI/ELMQKKAV 


6367 


287 


1934 


S IGFPVMLVLS I LsLYTCEMFQDSVAFEDYAVS FTQEEWALLDPS 
QKNLYRD VMQETFKNLTS VGKTWKVQNI EDEYKNPRRNI»SIiMR E 
KLCESKESHHCGESFNQIADDMIjNRKTLPGITPCESSVCGEVGT 
GHSSLNTHIRADTGHKSSEYQEYGENPYRNKECKKAFSYLDSFQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLNLCXIHERIHTGVKPYKCKQCGXAFTRSTTLPVHER 
THTGWADECKECGNAFSFPSEIRRHKRSHTGEKPYECKQCGKV 
FISFSSIQYHKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTKTEDKPYGCKQCGKGFRCA 
SQI^IHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
ECKQCGKAFRYFSSLHIHERTHTGDKPYECKVCGKAFTCSSSIR 
YHERTHTGEKPYBCKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLNPRSWPRTAGALPLRP PPLTMAV FHDEVE I EDFQYDE 
DSSTYFYPCPCGDNFS I TKEDLENGEDVATCP SCS Ii I IKVIYDK 
DQFVCGET V PAPS ANKELVKC 


6369 


3


1745 


AGCCRDTRFPTPRGPGS LCHNFCRSAACT VL'RTIHG S PREDTGT 
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQBCNLYRDVMQET 
F KNL/TS VG KTWKVQN I EDEYKNPRRNLSLMREKLCES KESHHCG 
ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD 
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKEKPY 
DGKECTBTFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCIi 
I HER IHTGVKPYKCKQ CGKAFTRSTTLP VHERTHTG VNADECKE 
CGNAFS FPS E IRRHKRSHTGE KP YE CKQCGKVF I SFS S I Q YHKM 
THTGEKP YECKQCGKAFRCGSHLQKHGRTHTGEKPYE CRQ CGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
S S LiH I HBRTHTGD K P YECK VCG KA FTCSS S I RYHERTHTG EKP Y 

ECKHCGKAFISNYIRYHERTHTGEKPYQCKQCGKAFIRASSCRE 
HERTHTINR


6370 


1711 


329 


FVLSEQRLRTERTWPRSPGLGRGAAAAGARTAGAGLLRbliLGCG 
ALVGGLR P VTMTTPANAQNAS KTWE LSL YELHRTPQ E A I MDGTE 
I AVS PRS LHS E LMCP I CLDKLKNTMTT KE C LHRFCSD C I VTALR 
SGNKECPTCRKKL VSKRSLRPDPNFDALIS K I YPSREE YEAHQD 
RVLI RLSRLHNQQALS SS IEEGLRMQAMHRAQRVRR P I PGSDQT 
TTMSGGEGE PGEGEGDGEDVSSDSAPDSAPG PAPKRPRGGGAGG 
SSVGTGGGGTGG VGGGAGSEDSGDRGGTLGGGTLGP PS P PGAPS 
P PE PGGE I ELVFRPHPLLVEKGEYCQTR YVKTTGNATVDHLSKY 
! LAI*R I ALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPEHPALPSLEGVSEKQYTIYIAPGGGAFTTIiNGSIiTliEIiVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSFKEKPMKC 
LHNNNFENALCRKESKEYLECRMERXLMLQEPLEKLGFGDLTSG 
KSEAXK 


6372 


2141 


625 


RVSAI ASEGKAE ER YKKLEDLLEKS FS LVKMPSLQPVVMCVMKH 
LPKVPEKKLKLVMADKELYRACAVEVRRQIWQDNQALFGDEVSP 
LLKQYILEKESAIiFSTEljSVIiHNFFSPSPKTRRQGEVVQRLTRM 
VG KNVKLYDM VI»QFT»RTLFIjRTRNVH Y CTLRAELLMS LHDLDVG 
EICTVDPCHKFTWCLDACIRERFVDSKRARELQGFLDGVKKGQE 
QVLGDtiSM I LCDPFAINTLALSTVRMLQELVGQETLPRDSPDLL 
LLLRLLALGQGAWDKIDSQVFKEPKMEVELI TRFLPMLMS FliVD 
D YT FNVDQ KLPAEE KAP VS YPNTL PES FTKFLQEQ RMACEVGL Y 
YVLHI TKQRNKNALLRLLPGLVETFGDLAFGDI FLHJLLTGNIiAL 
liADEFAIiEDFCSSLFDGFFLTASPRKENVHRHALRLIiIHLHPRV 
APSKLEALQKAIjEPTGQSGEAVKELYSOLGEKLEQLDHRKPSPA 
QAAETPALEI»PL?SVPAPAPL 


6373 


67 


711 


psraaras parlpamvs w i i3rlwli fgtlypayy s ykavks k 
dikeyvkwmm yw 1 1 falfttaetftdi flcwfpfy yelkiafva 
wllspytkgssulyrkfvhpti>sskekeiddclvoakdrsydal 
vhfgkrglnvaataavmaas kgqgalserlrs fsmqdlttirgd 
gapapsgppppgsgrasgkhgqpkmsrsasesasssgta 


6374 


535


2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHTFCNYTSSTI FLSSTRDHSCPTHTSCNYTSSTI FLSSTRD 
HSCPTHTSCNYTSSTI FLSSTRDHSCPTHTFCNYPRPI IRLSSC 
CPAEI,QTEGSNGKKEVLSGFQVVLEDTVLFPEGGGQPDDRGTrN 
DISVLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH 
SGQHLI TAVADHLFKUCTTS WELGRFRSAI ELDTPS MTAEQVAA 
I EQSVNEKIRDRLPVNVRELSLDDPEVEQVSGRGLPDDHAGP IR 
WNIEGVDSNMCCGTHVSNLSDLOVIKILGTEKGKKNRTNLI FL 
SGNRVliKWMERSHGTEKALTALLKCGAEDHVEAVK KT *QNST KI L 

ANEIGSEETLLFLTVGDEKGGGLFLIiAGPPASVETLGPRVAEVI, 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYI STQSAKE 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRPRRPGQQVVRGPTMLVTAYLAFVGLIASCLGLELSRCRAK 
PPGRACSNPSFLRFQLDFYQVYFLAIjAADWLQAPYLYKIiYQHYY 
FIiEGQ IAI L YVCGLAST VLFGLVASSIiVDWLGRKNS C VLFSI/T Y 
SLCCIiTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWIPATFARAAFWNHVLAVVAGVAAEAVASWIGIiGPVAP 
FVAAI PLLAIAGAliAI*RNWGENYDRQRAFSRTCAGGLRCLLSDR 
RVLLLGTI QALFESVI FI FVFLWTPVLDPHGAPLGI I FSSFMAA 
SIXGSSLYR I ATSKRYHLQPMHLLS LAVLI WFSLFMLTFSTS P 
GQESPVESF I AFI»IiIELACGIiYFPSMSFLRRKVI PETEQAGVLN 
WFRVPLHS IjACLGLLVLHDS DRKTGTRNMFS I CS AVM VMALLAV 
VGLFTWRHDAELRVPS PTEEP YAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
QD1.LI AALGMKLGSPKSS VT I WQPLKLFAYSQLTS LVRRATI>KE 
NEQI PKYEKIHNFKVHTFRG PHWCE YCANFMWGLI AQGVKCADC 
GLNVHKQCSKMVPNDCKPDLKHVKKVYSCDLTTLVKAHTTKRPM 
WDMCIRE I ESRGLNSEGLYRVSGFSDLIEDVKMAFDREX3EKAD 
I SVNMYEDINIITGALKLYFRDI.PI PLITYDAYPKFI ESAKIMD 
PDEQLETLHEALKLLP PAH CETLR YLMAHL KR VTLHE KENLMNA 
ENLGI VFGPTLMRS PELDAMAALNDIRYQRLWELLI KNEDI LF 


6377 


2311 


1845 


SRI RRRS SRRPRE P PGPSRRRRRRRPDPRTMP SE KTF KQRRTFE 
QRVEDVRLI REQHPTKI PVI IERYKGEKQLPVLDKTKFLVPDHV 
NMSELI KI I RRRLQLNANQAFFLLVNGHSMVSVSTP I SEVYESE 
KDEDGFLYMVYASQETFGMKLSV 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVA 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAE YHAD I YDXVSGDMOKOGCDPF CT /ZCZClt) T Q HDOnnv V THVvr 
YSMAYGPAQHAISTEKI KAKYPDYEVTWANDGY 


6379 


35 


378 


BRAGS PS PSRAALRRCAPQRSQAPRWPDRAACRRS FQGS QGRAY 
LFNS VVNVGCGPAEERVLLTGLHAVAD I YCENCKTTLG W KYEHA 
FESSQKYKEGKYI I ELAHMI KDNGWD 


6380 


1414 


462 


PAVQGQRGAGPPTGRGSGNMARFALTVVRHGETRFNKEKIIQGQ 
GVDEPLSETGFKQAAAAGI FX.NNVKFTHAFSSDLMRTKQTMHGI 
LERSKFCKDMTVKYDSRLRERKYGWEGKALSELRAMAKAAREE 
CPVFTPPGG ETLDQVKMRG I DFFE FLCQLI LKEADQKEQFSQGS 
PSNCLETS IxAE I FPLGKNHSSKVNSDSGI PGLAAS VLWSHGAY 
MRSLFD Y FLTDLKCSLPATLSRS ELMS VTPNTGMSL FI I NFEEG 

LALFTSLLC 


6381 


1668 


218 


AWRAQGS RGFSGAGWRPRQAAAMNFS EVFKLSS LLCKFS PDGK 
YLAS CVQ YRLWRDVNTLQ I LQLYTCLDQI QH I EWSADSLF ILC 
AMYKRGLVQVWSLEQPEWHCKIDEGSAGLVASCWSPDGRHILNT 
TEFHLRITVWSLCTKSVSYIKYPKACU5GITFTRDGRYMALAER 
RDCKDYVS I FVCSDWQLLRHFDTDTQDLTG I EWAPNGCVLAVWD 
TCLE YKI LLYS LDGRLLST YSAY EWS LG I KS VAWS PSSQ FIAVG 
S YDGKVR I LNHVTWKM I TEFGHPAAIND PKI WYKEAEKS PQLG 
I^CI^FPPPRAGAGPLPSSESKYEIASVPVSI*QTLKPVTDRANP 
OGIGMIjAFSPDSYFLATRNDNIPNAVWVWDIQKLRLFAVLEQL 
SPVRAFQWDPQQPRLAI CTGGSRLYLWS PAG CMS VQ VPG EGD FA 
VLSLCWHLSGDSMALLSKDHFCLCFLETEAVVGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCLIAYPLKGDHGIVDIVDNSDCEPKSKLLRWTTNK 
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 
TVEKKGN I SSQLKHYNPWSMKCHQQQLQRMKENAKHRNQ YKF I L 
LENLTS R YEVPCVLDLKMGTRQHGDDASEEKAANQ IRKCQQSTS 
AVIGVR VCGMQVYQAGSGQLM FMNKYHGRKLS VQG FKEALFQFF 
HNGRYLRRELI/3PVLKKLTELKAVLERQESYRFYSSSLLVIYDG 
KERPEWLDSDAEDLEDLSEES ADESAGAYAYKP I GASS VDVRM 
I DFAHTTCRLYGEDTWHEGQDAGYI FGLQSLID I VTEISEESG 
E 


6383 


3159 


1061 


SPAPGRPSPHGSQPAARAAAAPAMPSAKQRGSKGGHGAASPSEK 
GAHPSAARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRSLASQP 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVA2CK 
P PPAPQQP P P PPAPH PQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
GWCVHHVLEEVO^VRRSHQDFSRQREELGOGLOXSVEQKVQSLQA 
TFGTFESILRSSQHKQDLTEKAVKQGESEVSRISEVLQKLQNEI 
LKDLSDG IHWKDARERDFTSLENTVEERLTELTKS INDNI AI F 
TEVQKRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSREWDMEALRSTLQTMESDI YTEVRELVS LKQEQQAFKEAAD 
TERLALQALTEKLLRSEESVSRLPEEIRRLEEEIiRQLKSDSHGP 
KEDGG FRHS E AFEALQQKSQGIjDS RLQHVEDG VLS MQVASARQT 
ESLESL»L5?KSnRHPORT»AfcT/^f^PT.T?f2T^QCi?artf'"krM"" , T a cnrocT 

GETQLVLYGDVEEliKRSVGELPSTVESLQKVQEQVHTbLSQDQA 
QAARIj P PQDFLDRLS S IiDNLKAS VSQ VEADIiKMLiRTAVDSIiVAY 
SVKIETNENNLESAKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 


738


1904 


I WEVPVCLTHLLHLQQANQPLP P PSSS INEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYALFFRDTQAAIKGQ 
N PNATFGEVS Q I VAS M WDSLGE EQKQV Y KR KTEAAKKE YLKALA 
AYRASIiVS KAAAESAE AQTIRS VQQTLASTNI*TS SLLLNTPLSQ 
UGTVSAS PQTLQQSL PRS I APKPLTMRL PMNQ I VTS VT I AANM P 
SN I GAP Ii I S SMGTTMVGS APSTQVS PS VQTQQHQMQI>QQQQQQQ 
QQQMQQMQOQQDQQHQMHQQIQQQMQQQHFQHHMQQHLQQQQQH 
LQQQ I NQQQLQQQLQQRLQLQQLQHMQHQS QPS PRQHS P VASQI 
i £> fa JICj£>PQPASQQHQSQIQSQTQTQV1jSQVS1 f 


6385 


2 


1584 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LAQGPDAATTDELSSLGSDSEANGFAERRIDKFGFIVGSQGAEG 
ALEEVPLEVLRQRES KWLDMLNNWDKWMAKKHKKIRIiRCQKCil P 
PSLRGRAWQYLSGGKVKLQQNPGKFDELDMSPGDP KWLDVI ERD 
LHRQF P FHEM FVS RGGHGQQDL FRVLtKAYTL YRPEEG YCQAQAP 
I AA VJbLMHM P AEQAFWCIiVQICE KYLPGYYS E KX»E AIQLDGE I L 
FSI»LQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTI*PWSSVL 
RWDMFFCEGVKIIFRVGLVLLKHALGSPEKVKACCX3QYETIER 
LRSLSPKIMQEAFLVQEWEIiPVTERQIEREHLIQLRRWQETRG 

ETiOC r R c IPPPT.WOATf ATT\r>APPrtDI?I>2iT J^DCCCTdt dt nam nr- r» 

KAKP KP PKQAQKEQRKQMKGRGQLE KP PAPNQAM WAAAGDACP 
PQHVP P KDSAP KDS APQDLAPQVSAHHRSQESLTS QES EDTYL 


6386 


819 


195 


TVCGS F YLG I MQRASRLKRELHMIATEPPPGX TC WQDKDQMDDL 
RAQI LGGANTPYEKGVFKLEVI I PERYPFBPPQIRFLTP I YHPN 
IDSAGRICLDVLKIiPPKGAWRPKTiNT ATUT.TC TnT.T.MQP'DMDnn 
PLMADI SSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMLDNL 
PEAGDSRVHNSTQKRKASQLVGIEKKFHPDV 


6387 


1 


662 


PGPTHASADAWADAWAQPKMAMHNKAAPPQI PDTRRELAELVKR 
KQEIiAETIiANLERQ I YAFEGS YLEDTQM YGNI IRGWDRYLTNQK 
NSNS KNDRRNR KFKEAERLFS KSS VTS AAAVSAIAGVQDQL I EK 
REPGSGTESDTSPDFHNQEWEPSQEDPEDLDGSVQGVXPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRELAELVKR 
KQELAETLANLERQIYAFEGSYLBDTQMYGNIIRGWDRYIiTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSAIiAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKIiNKKPRADY 


6389 


1074 


497


AEPGDRMAGHRLVLVLGDLHIPHRCNSLPAKFKKIiLVPGKIQHI 
LCTCNLCTKESYDYLKTLAGDVHIVRGDFDENLNYPEQKVVTVG 
QFKIGli IHGHQVI P WGDMASIiALLQRQFDVDI L I SGHTHKFEAF 
EHENKFYINPGSATGAYNALETNIIPSFV1*MDIQASTVVTYVYQ 
LIGDDVKVER I E YKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTI AQRDDGVFVQEVTQNS PAARTGWKEGDQ I VG ATI YFDNLQ 
SGEVTQLLNTMGHHTVGUCLHRKGDRFFPS LGQTWDP 


6391 


5386 


2897 


VRWNSKTECYIiSIQTQENFPANLNELVNCIVISSLVTTQRJtLKA 
MS LLGS RNQLAPJVVLNPNPMDFCTKDLLTTTS ER I IAYLRDFNE 
DQKKAI ETAYAMVKHS PSVAKI CLIHGPPGTGKS KT I VGLLYRL 



LTENQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKI IIiEF 
KEKCKDKKNPLGNCGDINLVRI^SPEKSINSEVLKFSLDSQVNHR 
MKECELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DENIS KVS KERQELASKIKEVQGRPQKTQS 1 1 1 LESH I ICCTLS 
TSGG LLLE S AFRGQGG VPFS CV I VDEAGQS CE I ETLTPLIHR CN 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
MI SRLPI LQLTVQYRMHPDI CLFPSNYVYNRNIiKTNRQTEAIRC 
SSDWPFQP YliVFDVGDGSERRDNDSYINVQE I KLVME I 1 KLIKD 
KRKDVSFRNIGIITHYKAQKTMIQKDLDKEFDRKGPAEVDTVDA 
FOGRO KDC V I VTCVRANS IOGSIG FLASIiORIiNVTI TR AKY^T, V 
I I»GHLRTLMENQHWNQL I QDAQKRGAI I KTCDKN YRH DAVKI LK 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDS KE ITLTVTS KDPERP PVHDQLQDPRLLKRMG I EVKGG 
I FLWD PQ PS S PQHPGATPPTGE PGFPWHQDIiS HVQQ PAA WAA 
LSSHKPPVRGEPPAASPEASTCQSXCDDPEEELCHRREARAFSE 
GEQE KCG S ETHHTRRNSRWDKRTLEQEDS S S KKRKIiL 


6392 


972 


186 


GRTGVDLAS SMAHRLQ I RLIiT WD VKDTLLRLRHPIiGBAYATKAR 
AHGIiEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 
VLQTFHIiAGVQDAQAVAPIAEQIiYKDFSHPCTWQVLDGAEDTLR 

FrBTRCT.P T Ji VT <?WPTkPPT .Pfl T T fZCT CI DCTJCT^T^/t nnoTrs nnun 
■* Jvuaakjl*/* v x oiv r u jcrCJuc. L» J. ±i\jKjLA^LirUl>t\r i-Jc V Lt L i> UtAtVJ W V 

KPDPR I FQE ALRLAHME P WAAHVGDNYLCD YQGPRAVGMHS FL 

WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGL 


6393 


2017 


730 


TGGSKMAAVAT CGSVAASTGSAVATAS KSNVTS FQRRG PRAS VT 
NDSGPRLVS IAGTRPSVRNGQIiIjVSTGLPALDQIiIXSGGIjAVGTV 
LLIEEDKYN I YS PLLFKYFLAEG I VNGHTLL VASAKEDPANILQ 
ELPAPLLDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 

EPCSLTPGYTKLLQF IQNI I YEEGFDGSNPQKKQRNILR IGIQN 
LGSPIjWGDD I CCAENGGNSHSLTKFIiYVLRGIiLRTSLSACI ITM 
PTHLI QNKAI I ARVTTLS D WVGLES FI GSERETNPtiYKDYHGI* 
IHIRQiPRI^IjICI)ESDVKDrAFKLKRKLFTIERLHIiPPDLSD 
TVSRS S KMDLAESAKKW3PGCGMMAGGKKHLDF 


6394 


1418 


511 


GAAAGGEGARRR PAAMATV MAATAAERAVXiE E E FRWLLHDEVHA
VLKQLQD I LKEASLRFTLPGSGTEGPAKQENF I I»GS OGTDQVKG 
VI.TLQGDALSQADVNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 
NHVSQAIYLLTSRDQSYQFKTGMVLJO^AVMI^LTRARNRLT 
TPATLTIjPEIAASGLTRMFAPALPSDLLVNVYINI^KIiCIiTVYQ 
liHALQ PNS TKN FRPAGGAVLHS PGAMFEWGSQRLE VSHVHKVEC 
VI PWliNDALVYFT VS LQLCQQLKDKI S VFSS YWS YRP F 


6395 


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSGWPAGRTPTETSNPGS S VM 
ESVTFEDVAVE F I QEWALLDS ARRSLCKYRMLDQCRTLASRGTP 
PCKPS CVSQLGQRAE PKATERG ILRATGVAWESQLKPEELPSMQ 
DLLEBASS RDMQMGPGLFLRMQLVPS I EERETPLTREDRPALQ E 
PPWSLGCTGIiKAAMQIQRWI PVPTLGHRNPWVARDSGE 


6396 


1 


1221 


AN ILS S PSKRGQKGTLIGYS PEGTPLYNFMGDAFQHSSQS I PRF 
I KESLKQI LEESDSRQI FYFLCLNLLFTFVELFYGVLTNSLGLI 
SDG FHMLFDCSALVMGLFAALMSRWKATR I FS YG YGRI E I LSG F 
INGLFL I VI AFFVFMESVARL I DPPELDTHMLTPVSVGGLI VNL, 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GS AGGGMNANMRG VFLHVLADTIX3S I G V I VSTVIiI EQ FGWFIAD 
PLCSLFI AI IjI FI»S WPIj I KDACQVLIiLRLPPEYEKELHI ALEK 
I QKI EGL 1 S YRDPH FWRHSAS I VAGT IH IQ VTSD VLEQRI VQQV 
TG I LKDAG VNNLT I QVEKEAYFQHMSGLS TGFHDVLAMTKQME S 
MKYCKDGTYIM 


6397 


391 


122 


GAGGVGRFEAIRAPARMI EWCNDRLGKKVRVKCNTDDTIGDLK 
KLIAAQTGTRWNKI VLKKWYTI FKDHVSLGDYEIHDGMNI»EI*YY 
Q


6398 


353 


1306 


HKQMGPLINRCKKILLPTTVPPATMRIWLLGGLLPFLLLLSGLQ 
RPTEGSEVAIKIDFDPAPGSFDDQYOGCSKQVMEKLTQGDYFTK 
D I E AQKNYFRMWQKAHIiAWl>NQGK VLPQNMTTTHAVAI L FYTLN 
SNVHSDFTRAMASVARTPQQYERSFHFKYLHYYLTSAIQLLRKD 
SIMENGTLCYBVHYRTKDVHFNAYTGATIRFGQFIiSTSLLKEEA 
QEFGNQTLFTlFTCLGAPVQYFSLKKEVIiIPPYELFKVINMSYH 
PRGDWL»QLRSTGNL»STYNCQLLKASSKKCIPDPIAIASIjS FLTS 
VIIFSKSRV 


6399 


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRKRTWSKRGVAVSGPTK 

rrgmadslestplpspedriaklhpskslleyyqkkmaeceaen 
edllkklelykeacegqhklecdlqqreeeiaelqkalsdmqvc 

LFQEREHVLRLYSENDRLRIRELEDKKKIQNL1ALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDlQTLIIiQVEALQAQLGEQTKLSREQIEGLIED 
RRIHLEEIQVQHQRNQNKIKELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDNLMSK I KQYRVQCKKKEDKI GKVLPVMHE 
SHHAQS EYI KVMS LCRNEWYFSGRVEG IPKNLQFVM 


6400 


2520 


1053 


KTMKCDEVVYEVQSAILRHNCGYAMKTGKFFHNLMERKDFETWIi 
DNI S VTFLiSLTDLQ KNETLDIIL I S LSGAVQ LRHLSNNLETLL KR 
DFLKLLPLEDSFYI^KWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGWQIDDSVQDALHWKKVYIiKAILRMKQLEDHEAFETS 
SLIGHSARVYAL Y Y KDGLLCTGS DDLS AKL»WDVS TGQCVYG I QT 
HTCAAVKFDEQKLVTGS FDNT VACWE WS SGARTQH FRGHTGAVF 
SVDYrTOELDILVSGSADFTVICVWAI^AGTClJSrrLTGHTEWVTKV 
VLQKCKVKSLLHSPGDYILLSADKYEIKIWPIGREINCKCLKTI, 
SVSEDRSICLQPRLHFDGKYIVCSSALGLYQWDFASYDILRVIK 
TP E I ANLALLGFGD I FALLFDNRYLY IMDLRTESLI SRWPLPE Y 

RKSKRGSSFLAGEASWLNGLDGHNDTGLVFATSMPDHSIHLVLW 
KEHG 


6401 


109 


766 


PGAAWSRPDLRGCCTGPQPALRMLVLPSPCPQPLAFSSVETMEG 
P PRRTCRS PEPG PS SS IGS PQAS S PPR PNHYLLI DTQGVP YTVI* 
VDEESQREPGASGAPGQKKCYSCPVCSRVFEYMSYLQRHSITHS 
EVKPFECDI CGKAFKRASHLARHHSIHLAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402 


1196 


279 


TTSQCGGIRQSSAIPVASMEFAAICIiRNALLLLPEEQQDPKQEN 
GAKNSNQLGGNTESSES SETCSS KSHDGDKFI PAPPSS PL.RKQE 
LENLKCS I IiACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHLYAAEALISLDRISDAITHLNPENVTDVSLGISSNEQDQGS 
DKGENEAMESSGKRAPQCYPSSVNSARTVMLFNLGSAYCLRSEY 
DKARKCLHQAAS MI H P KE VPP EA I LLAVYLE IiQNGNTQLALQ 1 1 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6403 
6404 


2 

1012 


1690 
^22 


RG IHTSVLQGNLQNQMYSHNVVIMNIiNNLNLTQVQQRNLI TNLQ 
RSVDDTSQAIQRIKNDFQNLQQVFLQAKKDTDWLKEKVQS lqtl 
«rt«i>ii>AiiMJ\^uxwLri ijiiUMlMi>QjLjNS FTGQMENI TTISQANEQNLK 
DLQDLHKDAENRTAIKFNQLEERFQLFETDIVNIISNISYTAHH 
LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTliANIRLDSVSIiR 
MQQDLMRS RLDTEVANLS VI MEEMKLVDSKHGQIiI KNFT I LQGP 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
P AG P PGERGGKGS KGSQG PKGSRGS PG KPGPQGP SGD PG P PGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHWKNFTDKCYY 
FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQW I KKQMVGRESH 
WIGLTDSERBNEWKWLDGTSPDYKNVJKAGQPDNWGHGHGPGEDC 
AGLI YAGQWNDFQCEDVNNFI CEKDRETVLSSAL 
AAAIAMAAPAPGLISVFSSSQELGAAIiAQltVAQRAACCIiAGARA 
RFALGLSGGSLVSMLARELPAAVAPAGPASLARWTLGFCDERLV 
PFDHAESTYGLYRTHLLSRLP I PESQVITIN PELPVEEAAEDYA 
KKLRQAFQGDS I PVFDLLILGVGPDGHTCS LFPDHPLLQEREKI 
VAP I SDS PKP PPQRVTLTLPVLNAARTVI FVATGBGKAAVLKRI . 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAVATIiRNHRPR 
TAQRAAAQVIX3SSGLFmrHGLQVQQQQQRNLSIiHEYMSMEI*LQE 
AGVS VPKG YVAKS PDEAYAIAKKLGS KDWT KAQVIiAGGRG KGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 
VLVCERKYPRREYYFAITMERSFQGPVLIGSSHGGVNIEDVAAE 
TPEAI I KEP IDIEEG I KKEQALQLAQKMGFPPNIVESAAENMVK 
IiYSLir JbKYDATMXE INPMVEDSDGAVX>CMDAK1N t DbjNbAXKyK 
KI FDLQDWTQEDERBKDAAKANLNYI GLDGN IGCLVNGAGLAMA 
TMDI IKIiHGGTPANFIjDVGGGATVHQVTEAFKLITSDKKVIjAII* 
VNI FGGI MRCDVIAQG I VMAVKDLEIKI PVvvRIjQGTRVDDAKA 
LI ADSGLKI LACDDIiDEAARMVVKJjSE I VTLAKQAH VD VKFQLP 
I 


6406 


1036 


167 


HPRQMRGEDTP EAP P YS SGR YDS I KTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDSEXSMDPERb^ 

V P I S KQP KEKI QA1 1 E 5 Co RQF P fc. r UE KAK J\K i K 1 1 Jjj&>i.KKMK 
KNGMEMTRPTP PHLTSAMAENI IiAAACES ETRKAAKRMRIjE I YQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 

CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGLCIAVSQTVLAQLDALLVFPGQVAQLSCTLSPQHVTIRDYGV 
SWYQQRAGSAPRYTiLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 

t/T i , TODtTr>Di?nr>irivvr'o'\7/^vrti?CD 
vJji XoJrVUirr.ULi>*JJX iLaVbiur or 


6408 


1458 


903 


RGC I TSS QAWRLFGG VTRGFNMRI E KCYFCSGP I YPGHGMMFVR 
NDCKVFRFCKS KCHKNFKKKRNPRKVRWTKAFRKAAGKELT VDN 
S FE FEKRRNEP I KYQRELWNKT I DAMKRVEB I KQKRQAKFIMNR 

DVDMEDAP 


6409 


150 


446 


NTAL»ANIjIjRCFTCDR1iCGGCTAPAP pahqg I vlq p vmps CDPGP 
G PACLPTKTFIiS YLPRCHRT YS CVHCilAHIiAKHDELI S KS FQGS 
HGRAYLFNSV 


6410 


85 


607 


RGGTAGCVACLGCWGQSSSPKAAFPAGSACI,PADSCPCLI.FQAC 
AI SGLFNC I TIHPI^IJIAGVWMIMNAFILIiLCEAPFCCQFIEFA 
NTVAEKVDRLRSWQKAVF YCGMAWP I VI S1.TLTTLLGNAI AFA 
TGVLYGLSALG KKGD AI S YARIQQQRQQADEEKLAETIiEGEIj 


6411 


302 


772 


RLS I MASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAG I AVLFKKKFGGVQELIjNQQKKSGE VAVLKRDGRY I YYLI TK 
KRASHKPTYENL»QKS1jEAMKSKCLKNGVTDLSMPRIGCGLDRjUJ 
WENVSAMIEEVFEATDI KI TVYTL 


6412 


61 


1709 


RP VTS F S PLPG S CGGRLGTRTMLGRS I»RE VSAALKQGQ I T PTEL 
CQKCLSLIKKTKFT^AYITVSEEVALKQAEESEKRYKNGQSLGD 
LDG I P I AVKDNFSTSG IETTCASNMLKGYI PP YNAIVVQKLL 
GALLMGKTNLDEFAMGSGSTDGVFGPVKNPWSYSKQYREKRKQN 
PHS ENEDS DWL I TGGSSGGSAAAVSAFTC YAAIiGSDTGGSTRNP 
AAHCGIiVGFKPSYGIiVSRHGI»IPLVNSMDVPGII»TRCVDDAAIV 
I^AI^GPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKBYLV 
PELSSE VQSLWSKAADLFES EGAKVI EVSLPHTS YS I VCYHVLC 
TSE VAS NMARFDGLQ YGHRCD I D VS TEAM Y AATRR EG FND WRG 
R ILSGN FFLLKENYEjNTYFVKAQKVRRL. IANDFVNAFNSG VDVLL 
TPTTIiSEAVPYLEFIKEDNRTRSAQDDIFTQAVNMAGIjPAVSIP 
VALSNQGLPIGI^FIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
LMDD CS AVLENE KLAS VS t»KQ 


6413 


2 


885 


HE PRCAGMAASLWMGDLE P YMDENFI SRAFATMGET VMS VKI I R 
NRLTG I PAGYCFVEFADIATAEKCLHKINGKPLPGATPAKRFKL 
N YATYGKQP DNS P E YS LFVGDLTPDVDDGML YE FFVKVYP S CRG 
GKWLDQTGVS KG YG FVKFTDELEQKRALTE CQGAVGLGSKPVR 
LSVAI PKASRVKPVE YSQMYS YS YNQYYQQYQN YYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQSEEI»YDALMDCHWQPLDTVSS EI PAMM 


6414 


1 


538 


RGGRAALL PWRRF PCCR PR PQ PAR PS SRATPG PRS PGMATS I GV 
S FSVGDGVPEAEKNAGEPENTY I LRPVPfViR ™ d <z\r\7vr>r>TWA\7 
LKEELANAEYS PEEMPQLTKHLSENI KDKLKEMGFDRYKMWQV 

VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCVVAAFGC 
FYY 


6415 


2 


1168 


FVRQWQSSHRRACGIX5CEARAGGGEEPRGRASSVAGWVGAFRAP
FIEAAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTESSSVSEDGDSSSMDDEDCERRRMECLDEMSN 
LE KQ FTD L KDQ L Y KER LS Q VDAKLQ E VI AG KAP E Y LE P LATLQ E 

NMQIRTKVAGIYRELCLESVKNKYECEIQASRQHCESEKLLLYD 
TVOSEI£EKIRRT«KF!'DPH<?TnT r PQT?T.WlsTr»T?T ncDWDvnnfMnn 

KKKPGWSGPYIVYlVniQDLDILEDWTTIRKAMATLGPHRVKTEP 
PVKLEKHLHSARSEEGRLYYDGEWYIRGQTICIDKKDECPTSAV 
I TT I NHDE VWFKR P DGS KSKLYIS QLQKGKYS I KHS 


6416 


410 


1519 


EI APADLE I PACAP VLLSRATSSTMS VTGGKMAPSLTQE I LSHL 

GLAS KTAAWGTLGTLRTFLNFS VDKDAORLLRAITGQGVDRSAI 

VDVLTNRSREQRQLISRNFQERTQQDLMKSLQAAIiSGNIiERIVM 
ALLOPTAOFDAnETiPTAT.TraQno juttw/h tuti n«T»r>«T«TiT*r\T nr>nT 

AVY KHNFQ VE AVDG ITS E TSG I LQDLLLALAKGGRDS YS G I ID Y 
NLAEQDVQALQRAEGPS R EETVfVP VFTQRN PEHIiIRVFDQ YQRS 
TGQELEEAVQNRFHGDAQVALLGLASVIKNTPLYFADKLHQALQ 
ETEPN YQVL I RI LI SRCETDLLS I RAEFRKKFGKS L YSS LQDAV 
KGDCQSALLALCRAEDM 


6417 


1 


845 


RGESR VLWS E LEGEAGG AGG WAS S LNARMDNRFATA FV IACVL S 
LISTI YMAAS IGTDFWYEYRSPVQENSSDLNKS IWDEFI SDBAD 
EKTYNDALFRYNGTVGLWRRCITI PKNMI1WYS PPERTES FDWT 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL 
GLMCFGALIGLCACI CRSLYPTIATGI LHLLAGLCTLGSVSCYV 
AG I ELLHQKLELPDNVSGEFGWS FCLACVS AP LQ FMAS ALF I WA 
AHTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 
TPAPPPPPPCGGIACHGEPAKFYGYDNLQRQP I FTTQQEAELVQ 
YFDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQTLELEKEFLFNPYLTRKRRIEVSHALALTERQVKIWFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAQELEEDRAEGIjTN 


6419


1 


973 


PGRPRVRNFDXjNSKS ILQEFFCTRS iqi panrsktamskcpi fp 
MARSISTSGPU)KBDTGRQKI.ISTGSLPATLC^TDSLGLEWHL 
PSPDPVTVPYLSPLVVWKELESLLENEGDHAITVADFVDHHPIV 
FWNLVWYFRRLDLPSNIiPGL I LSSBHCNKYSKI PRHCMSEDS KY 
VL I QMLWDNMKLHQDPGQPLY I LWNAHTQKYPMVHIiliQKSDNS F 
NQELLKSMVKS I KMNDVYGPMSQILETLNKCPHFKRQRSLYREI 

LFLSLVALGRENID idafdke ykmaydrltpsqvksthncdrpp 
STGVMECRKTFGEPYL 


6420


207 


1187 


RKM I D KNQTCGVGQDS VP YM I CL IH I LEE WFG VEQLED YLNFAN 
YLLWVFTPLILLILPYFTI FLLYLTI I FLHI YKRKNVLKEAYSH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKI PEDGPALI I F YH 
GAIPIDFYYFMAKIFIHKGRTCRVVADHFVFKIPGFSLLLDVFC 
ALHGPREKCVEILRSGHLLAI S PGGVREALI SDETYNI VWGHRR 
G FAQ VAIDAKVP 1 1 PMFTQN I REGFRSLGGTRLFRWLYEKFRYP 
F APMYGG F P VKIjRT YLGDP I P YDPQ I TAEELAE KTKNAVQAL I D 
PL»N 1 MSAJb-b r.ri r H 


6421 


1844


362 


WALSLRRQPERMSNKLLSPHPHSWLRSEFKMASSPAVIiRASRL 
YQWSLKS SAQFLGS PQLRQVGQI IRVPARMAATL I LE PAGRCCW 
DE PVR I AVRG LAP EQP VTLRAS LRDE KG ALFQAHAR YRADTLGE 
LDLiERAPAbGGS FAGLEPMGIil»WALE P E KP LiVRIiVKRD VRT PI*A 
VELEVIiDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRGT 
L FL»PPE PG P FPG I VDMFGTGGGLI*E YRASLLAGKGFAVMALAYY 
N YEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGI SKGG 
ELCLSMAS FLKG ITAAWINGS VANVGGTLRYKGETLPPVGVNR 
NR1 KVTKDGYADIVDVLNSPLEGPDQKSFI pveraestflflvg 
QDDHNWKS E F YANEACKRLQAHGRR KPQ I ICYPETGH YI EPPYF 
PLCRASLHAIjVGSPI I WGGEPRAHAMAQVDAWKQLQTFFHKHLG 
GREGTIPSKV 


6422 


181 


2133 


EGENLSWFQEFWGDIAKEFYWKTPCPGPFLRYNFDVTKGKIFIE - 

WMKGATTN I CY1WLDRNVHEKKLGDKVAFYWEGNEPGETTQITY 

HQLLVQVCQFSNVURKQGIHKGDRVAIYMPMIPELWAMLACAR 

IGALHSIVFAGFSSESLCERILDSSCSIjIilTTDAFYRGEKIiVNL 

KEIJ^EALQKCQEKGFPVRCCIVVKHIiGRAELGMGDSTSQSPPl 

KRSCPDVQ I S WNQGIDLWWHELMQEAGDECE PE WCDAEDPLFI L 

YTSGSTGKPKGWHTVGGYMI.YVATTFKYVF0FHAEDVFWCrAD 

IGW ITGHS YVTYGPIiANGATSVLFEGI PTY PDVNRLWS I VDKYK 

VTKFYTAPTAI RLIJ^KFGDEPVTKHSRASLQVXCTVGE PINPEA 

WLW YHRWGAQRCP I VDTFWQTETGGHMIjTPIjPGATPM kpgs at 

F P FFGVAPAI LNESGEELEGEAEG YLV FKQP WPG IMRTVYGNHE 

RFETTYFKKFPGYYVTGDGCQRDQDGYYWITGRIDDMIiNVSGHL 

LSTAEVESAI,VEHEAVAEAAWGHPHPVKGECIiYCFVTLCDGHT 

FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 

LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6423 


614 


1237 


ANI#KE I PRDI*P PETVXJj YIjDSNQ ITS X PNE I FKDLHQLR VLNLS' " 
KNG I E F I DEHAFKGVAETLQTLDLS DNRIQS VHKNAFNNLKARA 
RIANNPWHCDCTIjQQVLRSMASNHETAHNVI CKTSVkDEHAGRP 
FtiNAANDADLCNLPKKTTDYAMI»VTM FGWFTMVI S YWYYVRQN 
QEDARRHLEYkKSLPSRQKKADEPDDISTW 


6424 


X 


1188 


KKVS WPVAAMVHCSCVI»FRKYGNFI DKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWWAHNRMTIjKQIfKDRYPRKRFVAGVGANSKISA 
LKGSKGKDWEI PVPVGISVTDENGKI IGELNKENDRILVAQGGL 
GGKl.I.TNFLPLKGQKRIIHLDLKIilADVGLVGFPNAGKSSLLSC 
VSHAKPAI AD YAFTTTjKPELGKIMYS DFKQI S VADL PGLI EGAH 
MNKGMGH KFLKH I ERTRQIjLFWD I S GFQLSSHTQ YRTAFETI I 
LLTKEI^LYKEEIiQTKPALIiAVNKMDLPDAQDKFHELMSQI,QNP 

1TT^l?T.U'T..lS , t?VKTM T OTTDnrDtTMlTTTlT O T\ t nv~»C/^ T Tyc»Y VMnrnvnT 

lUJr JjrLLtr ili\sxm x tr v£.f yHJ-Xf loAV lulliViJ.J£E.XjKNCIRKSIi 
DEQANQENDALHKKQIiLNLW I S DTMSSTEPPS KHAVTTSKMD 1 1 


6425 


1850 


1144 


LAMEGGGGIPLETIjKEESQSRHVLPASFEVNSLQKSNWGFLLTG 

lvggtlvavyavat pfvtpalrkvclpfvpatm kq i enwkmlr 
crrgslvdigsgdgriviaaaxkgftavgyelnpwlvwysryra 

WREGVHGSAKFY 1 SDLWKVTFSQYSNWI FGVPQMMLQLEKKLE 
RELEDDAR VIACR FP FPHWTPDHVTGEG IDTVWAYDASTFRGRE 
KRPCTSMHFQLPIOA 


6426 


30 


565 


SRGAAVGGMSVAGGEIRGDTGGEDTAAPGRFS FS PEPTLED I RR 
LHAEFAAERDWEQFHQPRNLLLALVGEVGELAELFQWKTDGEPG 
PQGWS PRERAALQE ELSDVjL I YI»VAI*AARCRVDIiPLAVLSKMDI 
NRRRYPAHLARSSSRKYTELPHGAISEDQAVGPADI PCDSTGQT 
ST 


6427 


145 


959 


AAS WGP PH V PKAGKMVSWMI CRLVVIfVFGMLCPAYAS YKAVKTK 
NIRE YVRWMMYWI VFALFMAAE I VTD I F I S W FP FYY E I KMAFSTL 
WLLS P YTKGAS I>L> YR KFVHPSI»SRHE KE I DA Y I VQAKERS YETV~ 

LS FG KRGLN IAAS AAVQAATKSQGALi AGRLRS FSMQDLRS I S DA 

PAPAYHDPLYLEDQVSHRRPPIGYRAGGIiQDSDTEDECWSDTEA 

VPRAPARPREKPLIRSQSLRVVKRKPPVREGTSRSLKVRTRKKT 
VPSDVDS 


6428 


1982 


444 


SGSGGKMEDHQHVPIDIQTSKLLDWLVDRRHCSbJCWQSLVLTIR " " 
EKINAAIQDMPESEEIAQLLSGSYIHYFHCLRILDLLKGTEAST 
KNI FGRYSSQRMKDWQEI IALYEKDNTYLVELSSLLVRNVNYEI 
P S LKKQ I AKCQQLQQEYS RKEEECQAGAAEMR EQF YHS C KQYG I 
TGENVRGELIALVKDLPSQLAEIGAAAQQSLGEAIDVYQASVGF 
VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSWERPHLEEI, 
PEQVAEDAIDWGDFGVEAVSEGTDSGISAEAAGIDWGIFPESDS 
KDPGGDG IDWGDDAVALQ I TVLEAGTQAPEG VARGPDALTLLEY 
TETRNQFLDELMELEI FIAQRAVELSEEADVLS VSQFQLAPAI L 
QGQTKEKMVTKTV'SVLEDLIGKIjTSLQLGHI.FMlLASPRYVDRVT 
EFr^QKIiKOSQLLAIiKKELMVQKQQEALEEQAAUEPKLDLLLEK 
TKEIiQKLI EADI S KR YSGRPVNLMGTSL 


6429 


3413 


3442 


EPSSWTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVFANGTt " 
WKSVTDKDAGDYLCVARNKVGDDYWLKVDWMKPAKIEHKEE 
NDHK VFYGGDLKVDCVATGJLi PNPE I S WS LPDG S LVNS FMQSDDS 
wvxvAAAt v v r »iNUi.ux r«c»vvjiriKcilivylJz ICf AENQVGKDEMRVR 
VKWTAPATIRNKTCIAVQVPYGDVVTVACEAKGEPMPKVTWLS 
PTNKVIPTSSEKYQXYQDGTLLIQKAQRSDSGNYTCIiVRNSAGE * 

drktvwihvnvqppkingnpnpittvreiaaggsrklidckaeg 

IPTPRVLWAFPEGWLPAPYYGNRITVHGNGSLDIRSLRK5DSV 

qlvcmari^ggearlivqltvlepmekpifhdpisekitamagh 
tislncsaagtptpslvwvlpngtdlqsgqqlqrfyhkadgmlh 
i sglssvdagayrcvarnaaghterlvslkvglkpeankqyhnl 

VSIINGETIiKIjPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 

ldngtltvreas vfdrgtyvcrmeteygpsvts I PVI viayppr 

TTSEPTPVI YTRPGNTVKLNCMAMGI PKADITWELPDKSHIiKAG 
VQARL YGNRFLHPQGSIjTI QHATQRDAGF YKCMAKNI DGSDS KT 
TYIHVF 


6430 


1946 


602 


RTRVSTGJbRRTLLWSEAVGASSTRGDTGIPGSGEGGAGPGGGEG " 

amleamaepspedppptlkpetqppekrrrtiedfnkfcsfvla 

YAGYI PPSKEESDWPASGSSS PJLiRGESAADSDGWDSAPSDliRTI 

qtfvkkaksskrraaqagptqpgpprstfsrlqapdsatllekm 

KLKDSI»FDLDGPKVASPLS PTSLiTHTSR PPAALTPVPLSQGDIiS . 
HPPRKKDRKNRKLGPGAGAGFGVL,RRPRPTPGDGEKRSRIKKSK 
KRXLKKAERGDRLPPPGPPQAPPSDTDSEEEEEEEEEEEEBEMA 
T WGG EAPVPVLPT P PEAPRPPATVHPEGVPPADSES KEVGS TE 
TSQDGDAS SSEGEMRVMDEDI MVESGDDS WDI*ITC YCRKPFAGR 

PMIECSLCGTWIHLSCAKIKKTNVPDFFYCQKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR 
LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
REL»RI*RN YVP EDEDLKKRRVPQAKP VAVEE KVKEQLEAAKPBP V 
IEEVDIANIiAPRKPDWDIJCRDVAKKLEKLKKRTQRAlAELIRER 
LKGQEDSLASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLiGTMGSRIKQNPETTFEVYVEVAYPRTGGTI>SDPEVQRQFPE 
D YSBQE VLQTTjTKFCFP FYVDSLTVSQ VGQNFTFVLTD I DS KQR 
FGFCRLSSGAKSCFCH^SYLPWFEVFYKLIiNILADYTTKRQENQ 
WNELLETLHiCLPI PDPGVSVHLS VHS YFTVPDTRELPS I PENRN 
LTEYFVAVDVNNMLHLYASMLYERRILI I CSKLSTLTACIHGSA 
AMLYPMYWQHVYIPVI,PPHLIJ)YCCAPMPYLrGIHIiSLMEKVRN 
MALDDWlLNVDTNTIiETPFDDLQSLPNDVISSLKNRLKKVSTT 
TGDGVARAFLKAQAAFFGSYRNALKIEPEBPITFCEEAFV3HYR 
SGAMRQFLQNATQIiOLFKQFIDGRLDLLNSGEGFSDVFEEEINM 
GE YAGSDKL YHQWLS TVRKGSG A I LNTVKTKANPAMKTVYKFB I 
AENGCAPTPEEQLPKTAPSPLVEAKDPKLREDRRPITVHFGQVR 
PPRPHWKRPKStflAVEGRRTSVPSPEQNTIATPATLHILQKSI 
TH FAAKFPTRGWTSSSH 


6433 


1524 


484 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA 
PTTPPQPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMLL 
LPLLLLIiVATTGPVGALTDEEKRLMVELHNLYRAQVSPTASDML 
HKRVfDEELAAFATCAYARQCVWGHNKERGRRGENLFAITDEGMDV 
PLAME EWHHEREH YNLS AATCSPG QMCGHYTQWWAKTERI G CG 
SHFCEKLQGVEETNIELLVCNYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCEP I GSPEDAQDLPYLVTEAPSFRATEASDSRKMG 
AEGPDKPSVVSGLNSGPGHVWGPIiLGIjIjLIjPPLVLAGIF 


6434 


40 


2002 


MPQLNFGMADPTQMGGIiSML.LIAGEHALGTPEVFSGTCRPDVSE 

S PELRQKSPLFQFAEIS sstshsdastkqcotsalfqfaeissn 

TSQLGGAEPVKRCGKSAIiFQLAEMCLASEGMKMEES KLI KAKES 
DGGR I KE LEKGKEEKE I KME KTDETRI»QKEAE FEKS AKENL»RDS 
KELRNFEALQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY 
SASSKI I ISDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 
KSKMDRHGNDKSTPKKTCKKRQSSESD I ESVI YT I EAVAKGDWG 
IEKIX3DTPRKKVRTSSSGKGSILDAKPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFI S I SASKN I SGETPEGI KAEPLTPMEDALPPS 
LSGQAKPEDSDCHRKI ETCGSRKSERS CKGALY KTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 
rKPKEDCLLGSAKLDEEFEKKFNSLPQYSPVTFDRKCVPVPRKX 
KKTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAIFSEDRNT 
MEPVHKVKNTPSIFNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 
DTQRAGYHHEEVLWMTNIJWNCGGVYLKQLRHTAMTNA 


6435 


2227 


657 


AI^RCAAAAYAHPEYEERFLQEETVSQQINSIELLQTRPLAIiPE 
VVKSQRPLQRQVHLRGRPASQPTVIRGITYYKAKVSEEENDIEE 
QQDE FFSGDNGVDbli I EDQLURHNGLMTS VTRRPAATRQGHSTA 
VTSDLNARTAPWSSAJLPQPSTSDPSIANHASVGPTLQTTSVSPD 
PTRESVLQPSPQVPATTVAHTATQQPAAPAPPAVS PREALMEAM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPEEEDDIRNVI 
GRCKDTLSTI TGPTTQNTYGRNEGAWMKDP1AKDERIYVTNYYY 
GNTLVEFRNLENFKQGRWSNSYKLPYSWIGTGHWYNGAFYYNR 
AFTRNI I KYDLKQRYVAAWAMLHDVAYEEATPWRWQGHSDVDFA 
VDENGLWLIYPALDDEGFSQEVIVLSKLNAADJjSTQKETTWRTG 
LRRNFYGNCFVICGV1»YAVDSYWQRNANISYAFDTHTNTQIVPR 
LLFENEYFYTTQIDYNPKDRLLYAWDNGHQVTYHVI FAY 


6436 


1295 


341 


GACRPPVRQDPDSGPDYEALPAGATVTTHMVAGAVAGIljEHCVM 
YP I DCVKTRMQS LQPDPAARYRNVLE ALWR 1 1 RTEGLWRPMRGL 
NVTATGAG PAHALYFAC YEKLKKTLSDVIHPGGNSH I ANGAAGC 
VATLIiHDAAMNPAEVVKQRMQMYNSPYHRVTDCVRAVWQNEGAG 
AFYRS YTTQIiTMNVP FQA IHFMTYEFLOEH FNPQRRYNPSSHVL 
SGACAGAVAAAATT P LD VCKTLLNTQ ES I . ALNS H I TGH I TGMAS 
AFRTVYQ VGG VTAY FRG VQARVI YQI PSTAI AWS VYE FFKYLI T 
KRQEEWRAGK 


6437 


1828 


360 


P PAPAPPAS PAR HVTRTARGHIiEGGSRAP PLLOAVFIiQ I XNMVK 
LIHTLADHGDDWCCAFSFSIiLATCSIiDKTIRLYSLRDFTELPH 
SPLKFHTYAVHCCCFSPSGHILASCSTTK3TTVLWNTENGQMIAV 
MEQPSGS PVRVCQFS PDS TCLASGAADGTWLWNAQS YKLYRCG 
SVKDGSLAACAFS PNGS FF VTGS S CGDLTVWDDKMRCLKSEKAH 
DLG ITCCDFSSQP VSDGEQGLQFFRLAS CGQDCQVKI W I VSFTH 
ILGFELKYKSTLSGHCAPVUVCAFSHDGQMLVSGSVDKSVIVYD 
TNTENILHTLTQHTRYVTTCAFAPNTLLLATGSMDKTVNIWQFD 
LETLCQARSTEHQLKQFTEDWSEEDVSTWLCAQDLKDLVGIFKM 
NN I DGKELLNLT KES IiADDLKI ES LGLRSKVL RKI EELRTKVKS 

LSSGIPDEFICPITRELMKDPVIASDGYSYEKEAMENWDPAKRN 
RTSPP 


6438 


109 


901 


EVQ I liRAKMFQTGGLI VFYGLLAQTMAQFGGLPVPLiDQTLPIjNV 
NPAIiPLSPTGLAGSLTNALSNGLLSGGLLGILENLPLLDIUKPG 
GGTSGGLIXJGIiLGKVTS VI PGLNNI IDIKVTDPQLLELGLVQSP 
DGHRLYVTI PLG I KLQ VNTPLVGAS LLRLAV KLD I TAE I LAVRD 
KQER I HL VLGDCTHS PGS LQ I SLLDGLGPLP I QGLLDS LTG I LN 
KVLPE LVQGNVCPL VNE VLRGLD I TLVHDI VNML I HGLQFVI KV 


6439 


23 


412 


S IQTASAITTEMASQSQG I QQLLQAEKRAAEKVADARKRKARRL 
KQAKEEAQMEVEQ YR REREHE FQS KQQAAMGS QGNLSAEVEQAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 


6440 


3 


517 


<vru\r«XMOut JU»L/Jjf L»Jj VK-L»£> IALiK j.y FWiXyJb'vr x KVDGQRFGQNRT 

IKLLTGSS YKVEVKI KPSTLQVENIS IGGVLVPLELKSKEPDGD 
RWYTGT YDTEGVTPTKSGERQP IQI TMPFTD IGTFETVWQVKF 
YNYHKRDHCQWGS PFSVI E YECKPNETRSLMWVNKESFIj 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDAljGY 
RNVCKENSTVGMKIQEELQRSGGLDHLVLSPGEWP VSDNT I MH I 
J. rtXirtii iiuihl LiULJLt X Kr. W V KCYVE I VEKIj PER R PDPAT I EGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERliET 
LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGJCPDVQWGRD 
MLRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISIC 
DSENKA I FPDNYDAEEREKT YRKWS S EGRGGRRGH DAPMI AYDA 
LLAAGMSWTRT.PPR&MFT;nf:pc;aiTr"rT »r , r , T t y vnr tmt «*-r-.<~ 

jjjjru-iuiio r» i cuv_ji it^U v llT MVJVjljO/iAHj 1 X AUt-JUr C> JjJj I GliDLVP K 

GLYQDLEDKEKIiEDLGAALYRLSTEEK 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGAIiAGLKTVSSYS 
LQRQSI^MSLVKLQLCHMLVEPNLCRSVLIANTVRQIQEEMTQ 
DGTWRTVAPOAAERAPLJ3RIjVSTFTT.rT?&RWr:nvf'aTJDaor'T nn 

GHTQGP VSDLCPVTSAQAPRHLQSSAWEMDGPRENRGS FHKSLD 
QIFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSSSCKSDIjGELDHWEILVET 


6443 


2 


555 


MASPAASSVRPPRPKKEPQTIiVI PKNAAEEQKL.KLERLMKNPDK 
AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHLRRR 
EYQRQDYMDAMAEKQKLDAEFQKRLEKNKIAAEEQTAKRRKKRQ 
KLtKEKKLLAKKMKJjEQKKQEG PGQPKEQGSSS SAEAS GTEEEEE 
VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPIPEPKPGDLIEIFRPFYRHWAIYVGDGYVVHIA 
P PS EVAGAGAAS VMS ALTDKAI VKKELLYDVAGSDKYQ VNNKHD 
DKYSPLPCSKI I QRAEELVGQE VL YKLTS ENCEHFVNEIiR YGVA 
RSDQVRDVI I AAS VAGMGLAAMS L I GVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARSPRPQAHTKGVRGLPSRRRSPDCGRMELAAGSF 
SEEQFWEACAELQQPAIAGADWQLLVETSGISIYRLLDKKTGLY 
BYKVFGVLEDCSPTLLADIYMDSDYRKQWDQYVKELYEQECNGE 
TVVYWEVKYPFPMSNRDYVYIiRQRRDLDMEGRKIHVILARSTSM 
PQLGERSGVIRVKQYKQSriAIESDGKKGSKVFMYYFDNPGGQIP 
SWLINWAAKWGVPNFIjKDMARACQNYLKKT 


6446 


1 


1651 


RCPTRSPPPDTPGSRGTTAMCSIASGATGGRGAVENEEDLPELS 
DSGDEAAWEDEDDADLPHGKQQTPCLFCNRLFTSAEBTFSHCKS 
EHQFNIDSMVHKHGLEFYGYIKIilNFIRLKNPTVEYMNSIYNPV 
PWEKEEYLKPVLEDDLI>IjQFDVEDIiYEPVSVPFSYPNGLSENTS 
WE KLKHME ARAIiSAEAAIiARARE DLQKMKQFAQDFVMHTDVRT 
CSSSTSVIADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD 
FI YQNPH I FKDKWIiDVGCGTGl LSMFAAKAGAKKVLGVDQSEI 
bYQAMDI IRLNKLEDTITLI KGKIEEVHLPVEKVDVI ISEWMGY 
FLLFESMJbDSVLYAKNKYLAKGGSVYPDICTISLVAVSDVNKHA 
DR IAFWDDVYGFKMS CMKKAVI PEAWEVLDPKTL I SEPCGI KH 
I DCHTTS I S DLEFS SDFTLK I TRTS MCTA I AG YFD 1 Y FEKNCHN 
RWFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK 
KDPRS LTVTLTLNNSTQT YGLQ 


6447 


1554 


1068 


RliGPAEWHLSGPCHATIiGAANRGRALGVRAAWRGAPLCQRVMMP 
SRTNIATGI PSS KVKYSRLSS TDDG Y I DIiQFKKTPPK I PYKAIA 
LATVL Fl> I GAFL 1 1 IGSLLLSGYI SKGGADRAVPVLI IGILVFL 
PG F YHLR I AY YAS KG YRG YS YDD I PD FDD 


6448 


74 


559 


GQVLSHCYHYRSSRWRRGGLSRGRGAGVMAbVPYEETTEFGLQK 
FHKPLATFS FANHTIQ IRQDWRHLGVAAWWDAAI VLSTYLEMG 
AVELRGRSAVELGAGTGLVGI VAALLACR I RYERDNN FLAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 


6449 


597 


1876 


EYG VCEN LRKLE I TG VSCRD V YAKL»LHR YRH I LGLWQ PD IGP YG 
GLLNWVDGLFI IGWM YL P PHD PHVDD PMRFKPLFR I HLMERKA 
ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
FRTWLREEWGRTLEDIFHEHMQELILMKFIYTSQYDNCLTYRRI 
YI*P PSRPDDLIKPGLFKGTYGSHGLEI VMLS FHGRRARGTKITG 
DPNI PAGQQTVEIDLRHRI QLPDI>ENQRNFNELSRI VLEVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGE PGDAVAAAEQ P AQCGQGQP FVLPVGVS SRNEDYPRTCRM 
CFYGTGL I AGHGFTS PERTPGVF I I*FDEDRFGFVWLELKS FSLY 
SRVQATFRNADAPSPQAFDEMLKNI QSLTS 


6450 


848 


269 


FVP APRTVSG KR£J LFGE WEERGEGEQRTGRE FS GNGGRAVEAAR 
MRI>L CGLWIiVI LS LLKVLQAQTPTPIjPLPPPMQS FQGNQFQGEWF 
VLGLAGNSFRPEHRALLNAFTATFEbSDDGRFEVWNAMTRGQHC 
DTWS YVL I PAAQPGQFTVDHRVWTHEQAGR PQDQPAGQELVAAS 
RDAGPVHLPGQSSGPLG 


6451 


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 
ILPGJJFLGPYSSAMKSKLPVLQKHGITHIICIRQNXEANFIKPN 
FQQL FRYLVLDIADNP VEN 1 1 RFF PMTKEFIDGS LQMGGKVLVH 
GNAG I SRSAAFVI AYIMETFGMKYRDAFAYVQERRFCINPNAGF 
VHQLQEYEAIYLAKIiTIQMMSPLQIERSLSVHSGTTGSLKRTHE 
EEDDFGTMQYATAQNG 


6452 


1 


652 


RTRGESSNME PLAA YPI>KCSG PRAKVFAVLLS I VLCTVTLFLCq"" 
LKFLKPKINS FYAFE VKDAKGRTVS LEK YKGKVSLWNVASDCQ 
LTDRNYLGLKELHKEPGPSHFSVLAFPCNQFGESEPRPSKEVES 
FAR KN YG VTFPI FHKIJKI LGS EGEPAFRFLVDSS KKE PRWNFWK 
YLiVNPEGQWKFWRPEEP IEVIRPDIAALVRQVI I KKKEDL 


6453 


827 


223 


HRR WLPGLSMS PRRTLPRPLSLCLSLCLdjCLAAALGSAQSGS C 
RDKKNCKVVFSQQEIjRKRLTPIjQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGV7PSFHDVINSEAITFTDD 

fsygmhrvetscsqcgahlghifddgprptgkrycinsaalsft 
padssgtaeggsgvas paqadkaeii | 


6454 


827 


223 


HRRWLPGI^MSPRRTLPRPLSLCJjSI^LCIiCLAAALGSAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCIWSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


6455 


1042 


173 


KVHI^ATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL " 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKIjEDFIMN'INSVLE 
S Ij Y I E I KRG VTEDDGRP I YAL VNLATTS I S KMATD FAE NELDLF 
RKALEL 1 1 DSETGFAS STN I LNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVKICNICHSIj 
LIQGQSCETCGIRMHLPCVAKYFQSNAEPRCPHCNDYVJPHEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


RPQSRSISMWRNSLLQVSSGLRWLRVCAMVDILGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KR YNVTA I P KLVI VKQNGEVI TNKGRKQ I RERGLAC FQD W VEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6461 


1653 


360 


LQQRTLRITAVGQTHPIAWMAWEPSLGAFYGPASFITFVNCMYF 
LS I FIQLKRHPERKYELKEPTEEQQRLAANENGE INHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATSLS FS AFF WHHCVNRED VRLAW I MTCCPGRS S 
YS VQVNVQP PNSNGTNGEAP KC PNS SAESS CTNKSAS S FKNS SQ 
GCKLTNLQAAAAQCHANSLPLNSTPQLDNSLTEHSMDND I KMHV 
APLEVQFRTNVHS SRHHKNRS KGHRASRLTVLREYAYDVPTS VE 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
STLPKSSRNFEKPVST/TSKKDALRKPAVVELENQQKSYGLNLAI 
QNGPI KSNGQEGPLLGTDSTGNVRTGLWKHETTV 


6462 


3 


773 


SEELDREKKLKEDS PRKTPNKESG VPSLP VS LTS I KEE PKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YIjHAYTYPQMYDPSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYG 
KMSGREETEKVNTS PSVNTKTTTESKALDLLQQHANQYRSKS PA 
P VEKATAER EREAE RE RDRH S P FGQRHLHTHHHTHVG MG Y P L I P 
GQ YDP FQGLTSAALVASQQVAAQAS AS GM FPGQRRE 


6463 


2 


350 


V I LC I LGG W I FKNADRSMEKKKGEPRTRAE AR P WVDEDLKDS S D 
LHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


G I LRQKEREERNR IHKKEI LFLEHLLWPSEMSSLSGKVQTVLG 
LVEPSKLGRTLTHEHLAMTFDCC YCPPPPCQEAI S KEPI VMKNL 
YW IQKNAYSHKENLQLNQETEAI KEELLY FKANGGGALVENTTT 
GI SRDTQTLKRLAEETGVHI I SGAGFYVDATHSSETRAMS VEQL 
TDVLMNE I LHGADGTS I KCGI IGE IGCSWPLTESERKVLQATAH 
AQAQLGCPVI IHPGRSSRAPFQI IRILQEAGADISKTVMSHLDR 
TI LDKKELLE FAQLGC YLEYDLFGTELLHYQLGPDI DMPDDNKR 
IRRVRLLVEEGCEDRILVAHDIHTKTRLMKYGGHGYSHILTNW 
PKMLLRG I TENVLDKI LI ENPKQWLTFK 


6465 


126 


1396 


QEAQVFGNQLI PPN AQVKKATVFLNPAACKGKARTLFE KNAAP I 
LHLSGMDVTIVKTDYEGQAKKLLELMENTDVIIVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLAI VKG ETVPLD VLQIKGEKEQP VFAMTGLRWGS FRDAG V 
KVSKYWYLEPLKI KAAHFFSTLKEWPQTHQAS ISYTGPTERPPN 
EPEETPVQRPS L YRR I LRRLiAS Y WAQ PQDALS OEVS PE VWKDVQ 
LSTI ELS I TTRNNQLDPTSKEDFLNI CIEPDT I SKGDFI TIGSR 
KVRN P KLHVEGTECLQAS QCTLL I PEG AGGS FS I DSEE YEAMP V 
EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPADMGRRKSKRKPPPKKKMTGTLETQFTC 
P FCNHEKS CD VKMDRARNTG V I S CTVCLEE FQT P I TY LS E P VD V 
YSDWIDACEAANQ 


6467 


301 


2571 


GELR VLALAHGELACHAVLTAS LLS LRSRLMDS DMD Y ER PNVET 
IKCVWGDNAVGKTRL I CARACNATLTQYQLLATHVPTVWAIDQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFSIANPWSLHHVKTMWYPEIKHFCPRAPVILVGCQLDLRY 
ADLEAVNRARRPLARPIKPNEILPPEKGREVAKELGIPYYETSV 

t/3V nor 1 T i/TM rT?r\iii iv T rj t\ 7\ t t c" t~> t^t i t nT>MVPtJT mcrrr/"* tit-it t y\ t\ rvr 
vAyrTjIKUVf JJJSJ/vl HAAXj I SRRHLiQrWKSHLiKNV QRPLLQAPFL 

PPKPPPP 1 1 WPDPPSSSEECPAHLLEDPLCADVI LVLQERVRI 

FAHKIYIjSTSSSKFYDLFLMDLSEGELGGPSEPGGTHPEDHQGH 

SDQHHHHHHHHHGRDFLLRAAS FDVCES VDEAGGS G PAGLRAST 

SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPLTYKSR 

LMWVKMDSS I QPGPFRAVLKYLYTGELDENRRDLMHIAH IAEL 

LEVFDLRMMVANILNNEAFMNQEITKAFHVRRTNRVKECLAKGT 

FSDVTFILDDGTI SAHKPLL I SSCDWMAAMFGGPFVES STREW 

FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLIILANRLCLPHL 

VALTEQYTVTGLMEATQMMVDIDGDVLVFLELAQFHCAYQLADW 

CLHH I CTNYNNVCRKFPRDMKAMSPENQEY FE KHRW PP VW YLKE 

EDHYQRARKEREKEDYLHLKRQPKRRWLFWNSPSS PSSSAASSS 

SP5SSSAW 


6468 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRI^LPMIXJLJjQLLAEPG 
IiGR VlQnJu^KDDVRHKVHLNTFGFFKI^YMWNVS PED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI 
SRSEVRVKSPPEAGTQLPKI I FSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKS KRS TVDSKAMGEKS FS VHNNGGAVS FQFFFNI S 
TDDQEGLYSL Y FHKCLGKELPSDKFTFS LDI E ITE KNPDS YLS A 
GE I PLP KLYISMAFFFFLSGTI WIH I LR KRRNDVFKI HWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTE YGLWKDS LFLVDLLCCGAI LFPWWS I RHLQE AS ATD 
SKGKFS RAHPVLLS LL 


6469 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG 
LGRVHRl^ALKIlDVRHKAmiiNTFGFFKDGYMVVNVSSLSLNEPED 
KDVTIGFSLDRTKKDGFSSYLDEDVNYCILKKQSVSVTLLILDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKS KRSTVDSKAMGEKSFSVHKNGGAVS FQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEIPLPKLYISMAFFFFI^GTIWIHII^KRRNDVFKIHWWIAAI. 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGADLF 
I T I AL I GTGWAF I KH I LS DKDKK I FMI VI PR R VLANVAY 1 1 1 ES 
TEEGTTE YGIjWKDSL FLVDLLCCGAI LFPWWS I RHLQE ASATD 
GKGKFSRAHFVLLSLL 


6470 


2726 


1437 


AAASGVSS RADAPVLAQS PASAGNGRPSTPRVPGSRRH PSAPRS 
G PL PREDG CRT PG PQIjLPtiPGAIi LR PRTLLSS AAETGRS RHP DT 
QH PSSGGRCRGGTES PS SAAGRPASMAEAEEDCHSDTVRADDDE 
ENESPAETDIiQAQLOMFRAQWMFELAPGVSSSWLENRPCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDIEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLI^S 
YFQQQLTFOESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
P YTS WREMFLER PR VR FDGVY I SKTT YI RQGEQS L»DGF YRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


239 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GPRNKKRGWRRLAQEPLGJbEVDQFLEDVRLQERTSGGLLSEAPN 
EKIjFFVDTGS KEKGLTKKRTKVQKKSLXJjKKPLRVDLI lentsk 
VPAPKDVIAHQVPNAKKLRRKEQLWEKLAKQGELPREVRRAQAR 
LLNPSATRAKPGPQDTVERPFYDLWASDNPLDRPLVGQDEFFIiE 
QTKKKGVKRPARLHTK P S QAPAVE VAPAGAS YNPS FEDHQTLI*S 
AAHEVELQRQKEAE KLERQLALPATEQAATQES TFQEI»CEGLIiE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRREKA 
VHRLRVQQAALRAARLRHQELFRLRG I KAQVALR LAELARRQRR 
RQARREAEADKPRRLGRLKYQAPDI DVQbSS ELTDSLRTLKPEG 
NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 

ARVDLQQQ I MT I IDELGKASAKAQNLSAPITSASRMQSNRHVVY 

ILKDSSARPAGKGAIIGFIKVGYKKLFVLDDREAHNEVEPLCII* 

DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 

liNKHYNLETTVPQVNNFVI FEGFFAHQHRPPAPSLRATRHSRAA 

AVDPTPAAPARKIiPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 

RAPRRATPPAHPPPRSSSLGNSPERGPIiRPFVP 


6473 


22 


912 


S SAVE FVWEGEKMAAEPNKTE I QTLFKRIiRAV PTNKACFDCGAK 
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ 
LRCMQ VGGNANATAFFRQHGCTANDANTKYNSRAAQM YREK I RQ 
LGS AALARHGTDLWI DKM3S AVPNHS PE KKDSDFFTEHTQP PAW 
DAPATEPSGTQQPAPSTESSGLAQPEHG PNTDLLGTS PKASLEL 
KSSIIGKKKPAAAKKGLGAKKGLGAQKVSSQSFSEIERQAQVAE 
KLREOQAADAKKQAEESMVASMRLAYQELQIDR 


6474 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEE IHI SRSTVNVSTSRGTP 
PSTI*SVKGQI ETVRVKGTBN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCPTFCFTDIVIMPKRKSPENTEGKDGS 
iCVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KG KKE E KQEAGKEGTAPS ENGETKAEE IH I SRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1090 


ARAMAQYKGTMREAGRAMHLLKKRERQREQMEVLKQRIAEETIL> " 
KSQVDKR FSAH YDAVEAELKSS TVGLVTLNDMKARQEAIjVRERE 
RQIiAKRQHLEEQRLQQERQR EQEQRRERKR KI SCI>S FALDDI>DD 
QADAAEARRAGNLGKNPDVDTS FLPDRDREEEENRIiREELRQEW 
EAQREKVXDE EMEVT FS YWDG SGHRRTVR VR KGNT VQQ FLKKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK 
SGPLFSFDVHDDVRIiLSDATMEKDESHAGKWLRSWYEKNKHIF 
PASRWEAYDPEKKWDKYTIR 


6477 


227 


915 


LQGHLMG I MAAS RPLS RFWEWGKN I VCVGRN YADH VREMRS AVL 
SEPVLFLKPSTAYAPEGSPILMPAYTRNLHHELELGVVMGKRCR 
AVP EAAAMD YVGG YALCLDMTARDVQDECKKKGLPWTIiAKS FTA 
SCPVSAFVPKEKIPDPHKLFCLWLKVNGELRQEGETSSMIFSIPY 

I IS YVSK1 ITLEEGDI iltgtpkgvgpvkendeieagihglvsm 
TFKVEKPEY 


6478 


2 


1495 


FVS SRI IiPESLASSEASTLEAMGRKEEDDCSSWKKQTTNIRKTF 
I FMEVTjGSGAFSEVFLVKQRLTGKLFALKCI kkspafrdsslen 
EXAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RG V YTE KDAS LVI QQVLS AVKYLH ENGI VHRDLKPENLL YLTPE 
ENSKIMITDFGLSKMEQNGIMSTACGTPGYVAPEVIAQKPYSKA 
VDCWS IGVITYI LLCGYPPFYEETES KLFEKI KEGYYEFESPFW 
DDISESAKDFI CHLLEKDPNBRYTCEKALSHPWI DGNTALHRDI 

Y P Q V Q T .OTO TTN PA K Q Tf UP HA PN & £ AVT7WWMP VT . UMMT XJ <! T>f\rD Ti 

EVENRPPETQASETSRPSS PE ITITEAP VLDHS VALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGSLH I S S SLVPMHQGSLAAG PCGCC 

HCRAGQTGVCLIM 


6479 


13 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALS FSRVPLPPVFDLS 
YFI VSI LYLKYEPGAVELSRR1IPI AS WLCAMLHCFGSY IIJVDLIj 
LGEPLI DYFSNNSS ILLASAVWYLI FFCPLDLF YKCVCFLPVKL 
I FVAMKEWRVRKIAVG IHHAHHHYHHGWFVM I ATGWVKGSGVA 
LMSNFEQLIiRGVWKPETNEILHMSFPTKASLYGAILFTLQQTRW 
LPVSKAS LI FI FTLFMVSCKVFLTATHSH S S PFDALEG YI CPVL 

KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPZ>YLRSAKMTEVMMNTQPMEEIGIiSPRKDGI*SY 
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAXENKEQ 


6481 


110 


1131 


KSRMDLDVVKMFVIAGGTLAI P ILAF VASFLLWPSALIR I YYWY 
WRRTLGMQVR YVHHEDYQFCYS FRGRPGHKPS I LMLHGFS AHKD 
MWLSWKFLPKNLHLVCVDMPGHEGTTRSSLDDLS IDGQVKRIH 
QFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSbWLVCP 
AGLQYSTDNQFVQRLKEIiGGSAAVEKIPLIPSTPEEMSEMLQLC 
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQ 
NMDKI KVPTQI I WGKQDQVLDVSGADMLAKS IANCQVELLENCG 
HSWMERPRKTAKLI I DFLASVHNTDNNKKLD 


6482 


2517 


568 


EPVSKVSQSRRJKAGVPTANIEESQAVEAAMANVPWAEVCEKFQA 
ALALSRVELHKNPEKE P YKS KYSARALLEEVKALLGPAPEDEDE 
R PEAEDG PGAGDHALGIiPAE WE PEG P VAQRAVRLAVI EFHIjGV 
NHIDTEELSAGEEHLVKCLRLLRRYRLSHDCISLCIQAQNNLGI 
LWS EREEI ETAQAYLES SEALYNQ YMKEVGS PPLDPTERFLPEE 
EKLTEQBRS KRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 
KRQIi EHNAYH P I EWAI NAATLSQFY INKLCFMEARHCLS AANVI 
FGQTGKISATEDTPEAEGEVPELYHQRKGEIARCWIKYCLTLMQ 
NAQLSMQDNIGELDIiDKQSELRALRKKELDEEESIRKKAVQFGT 
GELCDAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 
IDG YVTDH I E WQDHS ALFKGLAF FETDMERRCKMHKRR I AMLE 
PLTVDLNPQYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SHI VKKIKNLNKS ALKY YQDFIiDSLRDPNKVFPEHIGEDVLRPA 
MIAKFRVARLYGKIITADPKKELENLATSLEHYKFIVDYCEKHP 
EAAQEIEVELELSKEMVSLLPTKMERFRTKNALT 


6483 


3 


623 


KSHIJjCXSLRAPJVPI^ANGREARAMEQRIjABFRAARKRAGIiAAQP 
PAA^QGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQPQGSTSET PWNTAI PLPSCWDQS FLTNITFLKVLLW 
LVLLGLFVELEFGLAYFVLSLFYWMYVGTRGPEEKKEGEKSAYS 


6484 


201 


965 


VFNPGCEAIQGTLTAEQLERELQLRPLAGR 

QLAVKTKMSGLRPGTQVDPE 1 ELFVKAGSDGSS IGNCPFCQRLF 
M I I»WL KG VKFNVTT VDMTRKPEE LKDLAPGTNPP FL VYNKE LKT 
DFIKIEEFLEQTLAPPRYPHLSPKYKESFDVGCNLFAKFSAYIK 
NTQKE ANKN FE KSLItKEFKR tiDD YLNTPLLDE I DPDS AE EP P VS 
RRLFLDGDQLTLADCSLLPKLNI I KVAAKKYRDF D I PAE FSG VW 
RYLHNAYAREEFTHTCPEDKE I ENTYANVAKQKS 


6485 


I 6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRS I LEEDEE 
DEEPPRVLLYHBPRSFEVGMLWJHKHKKYPFWPAWKSVRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSLITDYRVRLGCGS FAGS FLE YYAADIS Y PVRKS I Q 
QD VLGTKLPQLS KG S P EEP WG CPLGQRQPCR KML»PDRSRAARD 
RANQKLVEY-GKAKGAESHLRAILKSRKPSRt^LQTFLSSSQYVT 
CVETYIiEDEGQLDLVVKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VIiL PEAI I CA I S AGDEVDYKTAEE KYI KG P S L»S YR E KE I FDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHiiSPSRVTQGIYYMIAFSEMPKPPDYSELSDSLTLA 

GGTGRFSGPLHRAWRMMNFRQRMGWIGVGLYLIiASAAAFYYVFE 

ISETYNRLALEHIQQHPEEPLEGTTWTHSLKAQLLSLPFWVWTV 

IFLVPYLQMFLFLYSCTRADPKTVGYCIIPICLAVICNRHQAFV 
KASNQISRLQLIDT 


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCASN 
YYNGCIFHRNIKGFMVQTGDPTGTGRGGKSIWGKKFEDEYSEYL
KHNVRGWSMANNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID
GLETLDELEKLPVNEKTYRPLNDVHIKDITIHANPFAQ


6488 


878 


241 


TALQEFGTSGPPLSLRFAIaPSGTGRFKPLFGARGPSWPPSPRVP" 
MEPPNLYPVKLYVYDLSKGIiARRLSPIMI.GKQliEGIWHTSIVVH 
KDEFFFGSGGI SSCPPGGTLLGPPDS WDVGSTE VTEE I FLE YI> 
SSLGESLFRGEAYNLFEHNCNTFSNE VAQFLTGRKI PS Y I TDLP 
SEVLSTPFGQALRPbLDS IQIQP PGGSS VGRPNGQS 


6489 


1457 


375 


KVAIQMATAI^KEELDNEDYYSIXNVRRE^SEELKAAYRRI>CML 
YHPDKHRDPBLKSQAERLFNIjVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWEWERRRTPAEIREEFERLQREREERRLQQRTNPKGT 
I S VG VDATDLFDR YDEE YEDVSGSS FPQ I E INKMH I SQS I EAPL 
TATDTAILSGSLSTQNGNGGGSINFALRRVTSAKGWGEI.EFGAG 
DLQGPIiFGLKLFRNIiTPRCFVTTNCAliQFSSRGI RPGLTTVIiAR 
NliDKNTVGYLQWHCSSPLLQVQRPHRNTRACAPEPS FRPFLHVP 

ll^DAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLliTPR 
SKRRTGGG 


6490 
6491 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFKGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE

DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL
KGNGSPVNGSSQQGTSNPSLGSNIPSLQWSLNGSSAGRKHS 




3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVIASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMIATCSADGIVRIYE
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL
KGNGSPVNGSSQQGTSNPSLGSNI PSLQNSLNGSSAGRKHS 


6492 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVT3HGTPKPFRK 
FDSVAFGESQSEDEQFENDLETDPPNWQQLVSREVLLGLKPCEI 
KRQEVINELFYTERAHVRTLKVLDQVFYQRVSREGILSPSELRK 
IFSNLEDILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLTWFSG
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN
PLCRRLQLKDIIPTQMQRLTKYPLLLDNIATYTEWPTEREKVKK
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN
VEELRNLDLTKRKMIHEGPLW1CVNRDKTIDLYTLLLEDILVLL
QKQDDRLVLRCHSKILASTADSKHTFSPVIKLSTVLVRQVATDN
KALFVISMSDNGAQIYELVAQTVSEKTVWQDLICRMAASVKEQS
TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG
LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLLVQQLGLT
EKSVQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP
FRTGTGDIATCYSPRTSTESFAPRDSVGLAPQDSQASNILVMDH
MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK
EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP
AVESTHQQQHSPQNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI
QSPSSCADSQSQIMEYIHKIEADLEHLKKVEESYTILCQRLAGS
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGS S TSDCMSKTLDS ASAHFAAS A WSAP VPSRS EVA 
KEQNTGH2JNING WQPSGTSKTLYS TNMALS SS PG I S AVQLVRX 
VGHTTTNHLI PALCTSSPQTLPMNNSCLTNAVHLNNVS WS PVN 
VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPIIJ^EKEEBG^SPILAHGG 
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTWAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTKVK
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL
PWVRYITQNGDYQLRTQ 


6495


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIKSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGKLSPILAHGG
VRFWWIKHNNLYLVATSKKNACVSLVPSFLYKVVQVFSEYFKEL
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSEIWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL
PWVRYITQNGDYQLRTQ 


6496 


247 


S59 


LRAVSLLPLQLVLPEYSIHSLFCIMFIXAQEWLTLGLNVPLLFY - " 
HFWRYFHCPADSSELAYDPPWMNADTLSYCQKEAWCKLAFYLL 
SFF Y YLYCMI YTLVSS 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGESGV
GKTNLLSRFTRNEFSHDS RTT IGVEFSTRTVMLGTAAVKAQ I WD 
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERWLKELY 
DHAEATIVVMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS 
ALDSTNVEIAFETVLKEIFAKVSKQRQNSIRTNAITLGSAQAGQ 
EPGPGEKRACCISL 


6498 


2636 


272 


S LRLCPWGTHbAGPTTMRLSSLLALI,RPAIiPLI LGLS LGCSLSL 
LRVSWIOGEGEDPCVEAVGERGGPQMPDSRARLDQSDEDFKPRI 
VPYYRDPNKPYKKVLRTRYIQTELGSRERLLVAVLTSRATLSTL 
AVAVNRTVAHHFPRLLYFTGQRGARAPAGMQWSHGDERPAWLM 
S ETLRHLHTHFGAD YDWFFI MQDDTYVQAPRIiAAXiAG HLS INQD 
LYLGRAEEFIGAGEQARYCHGGFGYL»LSRSLIiLRIiRPHLDGCRG 
DI LSARPDEWLGRCI>I DSLGVGCVSQHQGQQYRS FELAKNRDPE 
KEGS S AFLS AFAVHPVS EGTLMYRLHKRFS ALELERA YS E I EQL 
QAQIRNLTVLTPEGEAGLS WPVGL P APFTPHSR FEVLGWDYFTE 
QHTFSCMGAPKCPLOXSASRADVGDAJJETALEQLNRRYQPRIJiF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRAIARRVSL 
LRPLSRVE I LPMP YVTEATRVQLVLPI>LVAEAAAAPAFLEAFAA 
NVLEPREHAHiTLLLVYG PREGGRGAPDPFI»GVKAAAAEL»ERRY 
PCTRLAWLAVRAEAPSQVRLMDVVSKKHPVDTLFFLTTVWTRPG 
PEVLNRCRMNAI SGWQAFFP VHFQEFNPALSPQRS PPGPPGAGP 
DPPSPPGADPSRGAPIGGRFDRQASAEGCFYNADYLAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CS PRLS EELYHRCRLSNLEGLGGRAQLAMALFEQEQAKST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP "" 
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHLKIiAGM 
ADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQIjNGTYAIAG 
GKAHCG P AELCE F YSRD PDGL PCNLRKPCNR PSGLEPO PGVFDC 
IjRDAMVRDYVRQTWKLEGEAIjEQAIISQAPQVEKIjIATTAHERM 
PWYHSSLrREEAERKLYSGAQTDGKFIiIjRPRKEQGTYALSLIYG 
KTVYHYTiISQDKAGKYCIPEGTKFDTLWQLVEYLKCKADGLIYC 
L> K E AC PNS SASN AS G AAAPTI* PAH PS TIiTHPQRR I DTLN S DGYT 
PEPARITS PD KPR PMPMDTS VYES P YS DP EELKDKKX.FLKRDNL 
LXAD I ELGCGNFGS VRQGVYRMRKKQ I DVAI KVLKQGTEKADTE 
EMMREAQIMHQLDNPYIVRLIGVCQAEAIjMLVMEMAGGGPLHKF 
LVGKREE I PVSNVAELLHQVS MGMKYI*EEKNFVHRDLAARNVIiti 
VNRHYAKISDFGLS KALGADDSYYTARSAGK^PLKN YAPECINF 
RKFSSRSDVWSYGVTMWEAIjSYGQKPYTCKMKGPEVMAFIEQGKR 
MECPPECPPEIiYALMSDCW I YKWEDRPDFLTVEQRMRACYYS LA 
SKVEGPPGSTQKAEAACA 


6500 


1773 


726 


TGPTHASADAWGLVRSVTEV7CANVRGWPCAAALSCPQAVLDAGK 
MLSESS S FLKG VMLGSI FCAL I TMLGHI R IGHGNRMHHHEHHHIj 
QAPNKED I LKI SEDERMEIjSKS FRVYC 1 1 LVKPKDVSLWAAVKE 
TWTKHCDKAEFFSSENVKVFES INMDTNDMWLMMRKAYKYAFDK 
YRDQYNWFFLARPTTFAI I ENLKYFLLKKDPSQPFYLGHTI KSG 
DLEYVGMEGGIVLSV3SMKRLNS LLN I PEKCPEQGGM IWKISED 
KQLAVCLKYAGVFAENAEDADGKDVFNTKSVGLS I KEAMTYHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501


1 


570 


LVGMS GGGTETP VGCEAAPGGG S KKRDSLGTAGS AHL 1 1 KDLGE 

IHSRIILDHRPVTQGETRYFVKEFEEKRGLREMRVLENLKNMIHE
TNEHTLPKCRDTMRDSLSQVLQRLQAANDSVCRLQQREQERKKI
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEKRKARBRLKEQ

YAEMEKDLAKFSTF


6502 


213 


1650 


AGNKPDP WAGRNRTAVLPDVS VFHREDVGWW RSWLQQ S YQAV KE ' 
KSS EALE FMKRDLTE FTQWQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITX.MGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPEliFDAWIiSQFCLEEK 
KGEISELLVGSPS IRAbYTKMVPAAVSHSEFWHRYFYKVHQLEQ . 
EQARRDALKQRAEQS I SEEPGWEEEBEELMGISPI S PKEAKVP V 
AKI STFPEGEPGPQS PCEENLVTS VEPPAEVTPSESSES ISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGIiAVDVGETGPSPP 
IHSKPLTPAGHTGGPEPRPPARVETIjREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG 
EVSGPGGSEGS EPNGPGCESSPQPAQLS PQEGPCSCLR 


6503 


213 


1650 


AGN K PD P WAGRNRTAVLPDVSVFH REDVG VJW RS WLQQS YQAVKE 
KSS E ALE FMKRDLTEFTQVVQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPEbFDAWLSQFCIiEEK 
KGE 1 SELLVGS PS IRALYTKMVPAAVSHSEFWHRYF YKVHQLEQ 
EQARRDAIiKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV 
AKISTFFEGEPGPQSPCF.ENI>VTSVEPPAEVTPSESSESISLVT 
Q I ANPATAPEARVI>PKDLSQKLLEASLEEQGLAVDVGETG PSPP 
IHS KPLT PAGHTGGPE PRP PARVETI>REEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDUDMTBEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCIjVAHWVCLS ILS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPXTQ 
WKGTVLDQVP I N PSbYLVK YDG I DCVYGLELHRDERVLSLKI I»S 
DRVAS SHISDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQIJLjDDYKEGDIiRIMPESSESPPTE 
REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDDFHIYVYDLVKKS 


6505 


2131 


1294 


GKVCLiVAHWVCLS ILS P P P AGMKTP WAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSOPCRWI VGCR I SHGWKEGDEP ITQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGJbELHRDERVLSLKILS 
DRVASSHISDANbANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
REPGGVVDGL1GKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDDFH I YVYDLVKKS 


6506 


1 


1350 


E VS P PTS CCI»T VAVADPG VS EGFRG FGAGCEMPGRGRCPDCGS T 
ELVEDSHYSQSQLVCSDCGCWTEGVliTTTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDL.CRVML.PPTFEDTAVAYYQQAY 
RHSG IRAARLQKKEVIiVGCCVlilTCRQHNWPIiTMGAI CTIiL YAD 
LDVFSSTYMQIVIOiLGI^VPSLCLAEl.VKTYCSSFKL.FQASPSV 
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVI TAATFLA 
WQSLQPADRLS CS IjARFCKLANVDLP YPASSRIiQELLAVIiLRMA 
EQLAWLRV1.RLDKRSWKHIGDL.LQHRQSLVRSAFRDGTABVET 
REKEPPGWGQGQGEGSVGNNSLGLPQGKRPASPAIiLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYI*RTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507 


1878 


929 


RSHASRL.PE LPSGCLVLQVQELVQMS GMEATVT I P I WQNKPHGA 
ARSVVRRIGTNLPLKPCARASFETLPNISDLCIiRDVPPVPTLAD 
IAWIAADEEETYARVRS DTR PLRHTWKPS PLI VMQRNASVPNLR 
GS EERLLALKKPALPALSRTTEliQDELSHl.RSQIAKI VAADAAS 
ASLTPDFIjS PGSSNVSSPLPCFGSSFHSTTS FVISDI TEETEVE 
VP ELPS VPLLCS ASPECCKPEHKAACS S S E EDDCVS hS KASS FA 
DMMGILKDFHRMKQSQDLNRSL.LKEEDPAVLISEVLRRKFALKE 
EDISRKGN 1 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHIiQRGRSGLEPGTPRKMAAARP 
S IjGRVLPGS S VLFLCDMQEKFRHN IAYF PQIVSVAARMLKNTTL 
DLLDRGLQVHWVDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
ILQLVGDAVHPQFKEIQKLIKEPAPDSGLLGLFQGQNS1,I,H 


6509 


2 


1053 


F VWN PRGGRKRRRQAAVTQAATRASGTPS PRDGTMTQGKLS VAN 
KAPGTEGQQQVHGEKKBAPAVPSAPPSYEEATSGEGMKAGAFPP 
APTAVPLHPSWAYVDPSSSSSYDNGFPTGDHELFTTFSWDDQKV 
RRVFVRKVYT I LLIQI»I*VTIAVVALFTFCDP VKDYVQAN PGWYW 
ASYAVFFATYLTLjACCSGPRRHFPWNLILLTVFTLSMAYIjTGMXi 




WO 01/53312 



PCT/US00/34263 



SS YYNTTS VLLCLGI TALVCLS VTVFS FQTKFDFTSCQGVLFVL 
LMTIiFFSGLILAILLPFQYVPWLHAVYAALGAGVFTLFLALDTQ 
LLMGNRRHS LSPEE Y I FGALNT YLDI I YI FTFFLQIiFGTNRE 


6510 


37 


1156 


PCALDGCPQRGAVHPIjLSSAMGLIaAFLKTQFVLHLLVGFVFWS 
GLVINFVQLCTLALWPVSKQLYRRLNCRLAYSLWSQLVMLLEWW 
SCTECTLFTDQATVERFGKBHAVIILNHNFEIDFLCGWTMCERF 
GVIjGSSKVIjAKKE Lli YVPLI G WTWYFLE I VFCKRK WEEDRDT W 
EGLRRLSDYPEYMWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL 
KYHLLPRTKGFTTAVKCLRGTVAAVYDVTLNFRGNKNPSLLGIL 
YGKKYEADMCVRRFPLEDI PLDEKEAAQWLHKLYQEKDAJLQE I Y 
NQKGMFPGEQFKPARRPWTLLNFIjSWATI LLSPLFS FVLGVFAS 
GS PLLILTFLGFVGAGNGHCR 


6511 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWAS FPS PLPGPAPLKGGK 
TMATNFSDIVKQGYVKMKSRKLGIYRRCWLVFRKSSSKGPQRLE 
KYPD3KS VCLRGCPKVTEI SNVKCVTRLPKETKRQAVA 1 1 FTDD 
; SART FTCDSELEAEE W YKTLS VECLGSRLND I S LGE PDkLAPGV 
QCEQTDR FNVFLLPCPNLDVYGE CKLQI THEN I YLWD IHNPR VK 
LVSWPLCSLRRYGRDATRFTFEAGRMCDAGEGLYTFQTQEGEQI 
YQRVHSATLA I AEQEKRVLLEMEKNVRXLNKGTEHYS YPCTPTT 
MI>PRSAYWHH1TGSQNIAEASSYAGEGYGAAQASSETDLLNRFI 
LLKPKPSQGDSSEAKTPSQ 


6512 


159 


807 


FGKKSrWFPI^SRSLRVASGRSCKLGHGGYTGSGPGFGEPRDSGA" 
EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKEPRMVCTRKTK 
TL VSTC VI LSGMTNI I CLLYVGWVTNYIASVYVRGQEPAPDKKL 
EEDKGDTLKI I ERLDHLENVI KQHXQEAPAKPEEAEAEPFTDSS 
LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6513


2 


756 


FVS PE PGFSIAQLNLIWQLTDTKQLVHSFAEGQDOGSAYANRTA 
LFPDIjLAQGNASIjRLQRVRVADEGS FTCFVS I RDFGSAAVS IjQ V 
AAPYS KPSMTLEPNKDIjRPG DTVT I TCSSYQGYPEAEVFWQDGQ 
GVPLTGNVTTSOMANEQGLFDVHSIIjRWLGANGTYSCLVRNPV 
LQQDAHSSVTITPQRS PTG AVEVQVP EDPWALVGTDATLRCS F 
SPE PG FS LAQLNL I WQLTDTKQLVHS FAEGQDQGSAYANRTALF 
PDLLAQGNAS LRL.QRVR VADEGS FTCFVS IRDFGSAAVSLQVAA 
PYS KPSMTLEPNKDLRPGDTVTITCSS yqgy PEAEVFWQDGQGV 
PLTGNVTTSQMANEQGLFDVHS I LRVVLGANGTYSCLVRN PVLQ 
QDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSFSP 
EPGFSliAQIiNLlWQLTDTRQLVHSFTEGR 


6514 


985 


302 


VGI PGPT I SSAAEMEDLLDIOEELR YSLATSRAKMGRRAQQESA 
QAENHLNGKNSS LTTjTGETSSAKLPRCRQGGWAGDS VKAS KFRR 
KASEEI EDFRLRPQSLNGSDYGGDI P 1 1 PDLEEVQE EDFVLQVA 
APPSIQI KRVMT YRDLDNDLMKYSAIQTLDGE I DL KLLTKVLAP 

EHEVRERNPSWQDDVGWDWDHIiFTEVSSEVLTEWDPLQTEKEDP 
AGQARHT 


6515


1345 


305 


GRVGS RRRGAAVPGGCGAGSTQIiEVSASASCGALGS ADMNP 1 W 
VHGGGAGPI SKDRKERVHQGMVRAATVG YGILREGGSAVDAVEG 
AWALEDDPEFNAGCGSVLNTNGEVEMDAS imdgkdls agavs a 
VQCIANPI KTiARLVMEKTPHCFUTDQGAAQFAAAMGVPEI PGEK 
LVTERNKKRLEKEKHSKGAQKTDCQKNLGTVGAVALDCKGNVAY 
AT3TGG I VNKMVGR VGDS P CLG AGG YADND I GA VS TTGHGE S I b 
KVNIiARLTLFHIEQGKTVEEAADLSLGYMKSRVKGLGGLIVVS K 
TGDWVAKWTSTSMPWAAAKDGKLHFG I DPDDTTITDLP 


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSAI.QQI, 
KGRRSEHRNENQEMPYSTNKEL I LGI MVGTAGI S LLLLWYHKVR 
KPGI AMKLPEFLSLGNTFNS ITLQDE IHDDQ.GTTVI FQERQLQ I 
LEKLNELLTT^MESIiKEEIRFLKEAIPKLEEYIQDEMGKITVHK 
ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS 
fpvpkafntrveelnldvllqkothlrmsesgksesfeiXrdhi^ 
exfrde i efmwrfaraygdmyelstntqekkhyanigktls era 

INRAPMNGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKEHLDIA 
I KLLPE E PFLYYLKGR YC YTVS KLS W I E KKMAATL»FG KI F S S TV 
QEALiHNFl»KAEELCPGYSNPNYMYTiAK'r , YTDT,FT7NriM!iT i/ur-nr 
ALLLPTVTKEDKEAQKEMQKIMTSLKR 


6517 
6518 


3 


1414 


GRVWGGSSSLNAMVYVRGHAEDYERWQRQGARGWDYAHCLPYFR
KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP
LTEDMNGFQQEGFGWMDMTIHEGKRWSAACAYLHPALSRTNLKA
EAETLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVILSGGAIN
SPQLLMLSGIGNADDLKKLGIPWCHLPGVGQNLQDHLEIYIQQ
± - L -»«o*kyi\.f LKA.VL itofJjt.WL-WKF TGEGATAHLETGGFIR 
SQPGVPHPDI QFHFLPSQVI DHGRVPTQQEAYQVHVGPMRGTS V 
GWIiKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKI»TREIFAQE 
AIAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
P SDPTAWDPQTRVLG VENIjRWDAS IMPSMVSGNLNAPTIM I A 
EKAADI I KGQPALWDKDVPVYKPRTLATQR 




242 


1098 


PAWN PGS E PRTRVR PRARS FPLP PPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAPP PPSTMGDAGS ERSKAPSLPPRCPCGFWGSSKTKN 
LCS KCFADFQKKQPDDDS APSTSNSQSDLFSEETTSDNNNTS IT 
*- c *■ K^royyf ±jf l £,J_iN v I ^l^KEECGPCTDTAHVSLITPTXRSC 
GTDSQSENEAS P VKRPRLLENTERS EETSRS KQKSRRRC FQ CQT 

KLELVQQEIiGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KMVKIjDRKVGRS CQRI GEGCS 


6519


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKK VRTEEXKAPRRVNGEGGSGGNSRQLQPPAAPS PQS YGS PAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLI, 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK 
1 — ivi^wan v ijijJ\objyiivFKlTII EDLQI KKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
K1^DTKNYDSKIPENSEFPFVSI>KEPRVQNNLKRLDTLBFKQI»I 
HI EHQPNGGAS VTHCLQ 


6520 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVKKKKKKKHK
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI
HIEHQPNGGASVIHCLQ


6521 
6522 


184 
1042 


1798 
391 


KLFKKATDTSQGELVTIPKALPLIVGAQLIHADKLGEKVSDSTMP 
I RRTVNSTR ETPPKS KJaAEGEEE KPE PD I S S EES VSTVEEQENE 
TP P ATSS EAEQPKGEPENE E KEENKS S EETXKDEKDQS KEKEKK 
VKXTIPSWATLSASQIARAQKQTPMASSPRPKMDAILTEAIKAC 
FQKSGAS VVAIRKY I IHKYPSLELERRGYLLKQALKRELNRGVI 
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEAS YS L IRK YVSQ Y YPKLR VD I RPQLLKNA 
I^RAVERGQLEQITGKGASGTFOLKKSGEKPLLGGSLMEYAILS 
AIAAMNBPKTCSTTAI*KKYVIjEWHPGTNSNYQMHLLKKTIiQKCB 
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 
OEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA 
PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 

NKWLRPSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED 
ECLDYYGMLSI^RMFEWGGQLTECELELLAFLLDEAPGAAGGL 
S RARSGIi KbLLELERRGO^TDE SI^RLLGQLLRVIiARHDLLPHIiA 
RKRRRPVSPERYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDIVKQGYVKI
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN
LDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST 
V7FTFESGRMCDTGEGLFTFQTREGEMI YOKVHSATLAT apnwwT? 
LMLEMEQ KARLQTSLTEPMTLS KS I SLPRS AYWHH I TRQNS VGE 

IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDIVKQGYVKI
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN
LDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF
MDTSTCKWHDLE 


6525 


1 


1859 


GESPFSEEES I EFNPS S SGRSARTVSSNS FCSDDTG WPS SQSVS " 

PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 
KESKSGLVKPGSEADFSSSSSTfSQ TQaPKOTMCTarcvuoooon 

NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQ YLT PLQQ KEVTVRHLKTKLKE S ERRLHERES EI VELKSQLAR 
MREDWIEEECHRVEAQLALKEARKEI KQLKQ VI ETMRSS LADKD 
KGIQKYFVDINIQNKKLESLIjQSMEMAHSGSI^RDELCLDFPCDS 
PEKSLTLNP PLDTMADGLSLEEQVTGEGADR ELLVGDS I ANSTD 
LFDE I VTATTTESGDLELVHSTPGANVLELLP I VMGQEEGS VW 
ERAVQTDWPYS PAISELIQSVLQKLQDPCPSSLASPDES EPDS 
MES FPESLSAL WDLTPRNPNSAI LLS PVETPYANVDAEVHANR 
LMRELD FAACVEERLDGV I PLARGGVVRQY WSSS FLVDLLAVAA 
P WPTVLWAFS TQRGGTDP VYNIGALLRGCCVVALHSLRRTAFR 
IKT 


6526 


2 


2034 


SGRAGEPEE WRGRQ 1 1 DS KETW I P FNSEDSQQLEEAYSSGKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVP YSES FS QVLEETYMLAVTLDEWKKKLES PNRE II ILHNP 
KLMVHYQPVAGSDDWGSTPMEO^RPRTVKRGVENISVDIHCGEP 
IiQIDHLVFVVHGIGPACDLRFRSIVQCWDFRSVSLNLLQTHFK 
KAQENQQ IGRVE FLPVNWHS PLHS TGVDVDLQRI TLPS I NRLRH 
FTNDT I LDVFF YNS PTYCQTI VDTVAS EMNR I YTLFLQRNPDFK 
GGVS IAGHSLGS LI LFDI LTNQKDSLGDIDSEKGSLNI VMDQGD 
TPTLEEDLKKLQLSEFFDI FEKEKVDKEALALCTDRDLQB IG IP 
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN 
TRNGD YLD VG I GQVS VKY PRL I YKPE I FFAFGS P IGM FLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKGRIO^MHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY 
PALQASETPEETEAEPESTSEKPSDVNTEETSVAVKEE VLP I NV 
GMLNGGQR I DYVLQEKP I ES FNEYLFALQSHLCYWESEDTVLLV 
LKEIYQTQGIFLDQPLQ 


6527


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS 
FIFNGDAAPSESFVVLDNEQKVYQRIHHEESEMETEEEVDILMS 
SDI YSATLSTKS I S FTRAQTG WLFREDKTER VGNFLADFYLVNG 
LVLESRKRREHLSEEDILRNKAIMESLSKGGNIMEQNFEPIRRQ 
SLT PP PQNT I TW E E Y I S AENGKAPH LGRELVCKE S KKTFKAT I A 
MSQEFPU3IELLLNVLEWAPFKHFNKLREFVQMKLPPGFPVKL 
DI PVFPTITATVTFQEFRYDEFDGS I FTTPDDYKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKI LFCVLGLYIA 
IPFL I KLCPG I QAKLI FLNFVRVPY FI DLKKPQDQGLNHTCNYY 
LQPEE D VT IG VWHT VP AVWWKNAQG KDQMWYEDALASSHP 1 1 LY 
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE 
RGMT YDALHVFDW I KARSGDN P VY I WGHS LGTGVATNLVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSGZ KFANDENVKHISCPLLI LHAEDDPWPFQLGRKLYS IAA 
PARSFRDFKVQFVPFHSDLGYRHKYIYKSPELPRILREFLGKSE 
PEHQH 


6529 


363 


2215 


THI R YNKIGWKTMSCGNEFVETLKK IG Y PKADNLNGEDFD WLF 
EGVEDESFLKWFCGNVNEQNVLSERELEAFSILQKSGKPILEGA 
AIJDEALKTCKTSDLKTPRLDDKELEKLEDEVQTLLKLKNLKIQR 
RNKCQLMASVTSHKSLRLNAKEEEATKKLiKQSQG I LNAMITKI 3 
NELQALTDEVTQI^MFFRHSNLGQGTNPLVFLSQFSLEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPS I CD 
NQE I LEERRLEMARLQLAY I CAQHQLIHLKASNSSMKSS I KWAE 
ESLHSLTSKAVDKENLDAKISSLTSEIMKLEKEVTQIKDRSLPA 
WRENAQ LLKMP WKGDFDLQ1AKQDYYTARQE LVLNQLI KQKA 
S FELLQLS YE IELRKHRD I YRQLENLVQEbSQSNMMLYKQIiEMIj 
TDPSVSQQINPRNTIDTKDYSTHRLYQVLEGENKKKELFLTHGN 
LEE VAEKLKQNI SLVQDQLAVS AQEHS FFLSKRNKDVDMLCDTL 
YQGGNQLLLSDQELTEQFHKVESQLNKLNHLLTDILADVKTKRK 
TLANNKLHQMEREFYVYFLKDEDYLKDIVENLETQSKI KAYS L3 
D 


6530 


128 


2986 


GAAHHG AI VQ VHPLLPGS S T I M I HDLCLVF PAPAKA WYVS D I Q 
ELYIRVVDKVEIGKTVKAYVRVLDLHKKPFLAKYFPFMDLKLRA 
ASPIITLVALDEALDNYTITFLIRGVAIGQTSLTASVTNKAGQR 
I NS APQQ I EVF PPFRLMPRKVTLLIGATMQVTSEGG PQPQSN I L 
FS I SNES VALVS AAGL VQGLAI GNGT VSGLVQAVDAETGKWT I 
S QDLVQ VEVLLLRAVR I RAP Z MRMRTGTQMP I YVTG I TNHQN P F 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHE AS IRLPSQYNFAMNV 
LGRVKGRTGLRAVVKAVDPTSGOLYGLAREIiSDE I QVQ VFEKLQ 
LLNPBIEAEQILMSPNSYI KLQTNRDGAASLSYRVLDGPEKVPV 
VHVDEKG FLASGSM IGTST I EVT AQEPFGANQTI I VAVKVS PVS 
YLRVSMSPVLHTQNKEALVAVPIiGMTVTFTVHFHDlJSGDVFHAH 
S S VLNFATNRDDFVQ IGKGPTNNTCVVRTVS VGLTLLR VWDAKH 
PGLSDFMPLPVLQAI S PELSGAMWGDVLCLATVLTS LEGLSGT 
WSSSANS I LHIDPKTGVAVARAVGS VTVY YE VAGHLRTYKE WV 

S VPQRI MARHLHP IQTSFQEATAS KVI VAVGDRSSNLRGECTPT 

fiPRVTOAT.WDPTT.T Qr*nQnTTtrT>ZiT/irPT?t)COr>\?V , TTrr , DnTrr^ r P7\ T r* 
Mxvc* v j.yrtjjrif jj x u_l ovyoyr P>.cJ±v r uf roywvr J. v tryr U i ** i .1 -i 

QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA 
EVPFS PG LFADQAE I LLSNH YTS SE I RVFGAPEVLENLEVKSG S 
PAVLA FAKEKS FG W PS F I TYTVGVLDPAAGSQGPLSTTLTFS S P 
VTNQAIAI PVTVAFWDRRGPGPYGASLFQHFLDSYQVMFFTLF 
ALLAGTAVMI IAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
SPTSPNALPPARKASPPSGLWS PAYASH 


6531 


845 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
SLCMVITIYYDVKVREIVRGCGQYISYRCQEKRNTYFAEYWYQA 
QCCQYDYCNS WSS PQLQSSLPEPHDRPLALPLSDSQI QWFYQAI* 
NLSLPLPNFHAGTEPDGLDPMVTLSLNLGIiSFAELRRMYLFIjNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLIKGVAPPTL 



ITDSTGTHJjVLTVTNKNAHSPGLSRGSPQQPSSQPGSPAPAPSA 
QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDX.FDILIQSGEISADFKEPPSLPGKEKPSPKTVCWSPLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
EPLSLrDDLHSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 

DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLFSTDFIiDGHD 
LQLHWDSCL 


6533 


1798 



373 


STISWLARVEPPRRSSGVGAARLRFPGGSRPLRARACVLALtAVXj 
ALLERNNADSMSAHS MI*CE R X A I AKEI» I KRAESLS RSRKGG I EG 
GAKLCS KUKAELKFLQKVEAG KVAI KESHLQSTNLTHLRAI VES 
AENLEEVVSVLHVFGYTDTLGEK0TLVVDWANr;r:w r rvn7'i < ra Trr> 
KAEALHNI WLGRGQYGDKS 1 1 EQAEDFLQASHQQ P VQ YSNPHI I 
FAFYNS VSS PMAEKLKEMG I S VRGDI VAVNALLDHPEEIjQPS ES 
ESDDEG PELLQVTRVDREN I LASVAFPTE IKVDVCKR VNI*DI TT 
LITYVSALSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQSILDTLGGPGERERATVI/IKRINWPDQPS 
ERAIiRLVAS S KINS RSLTT FGTGDTL KAI TMT ANSG FVRAANNQ 
GVKFSVFIHQPRALfTESKEAIiATPLPKDYTTDSEH 


6534 


47 


596 


KATRFISAAWVLNKQGVSPAKLPHTSWSWSLQTLSFLFSGDLA - 
EKSLQCFPCSAMLLELI PLLG I HFVLRTARAQS VTQPDIH ITVS 
EGASLELR CN YS YGATP YLFWMERTVE EAF I LL»VCLKP WRVAS S 
IiEKKEKEDES FQLIiLGSRYNVliKAHCIiliPLI RWLTSGDSLLS AO 
PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKBKNLETLLTLAFLEIDKAFSSHARLS w 
ADATLLTS GTTATVALLRDG I EL. WA QW?nc J? A T T n> v-nvTiMTrr 

TIDHTPERKDEKERI KKCGG F VAWNSLGQ PHVNGRLAMTRS I GD 
LDLKTSGVI AE PET2CRIKLHHADDS FI*VI#TTDG INFM VNSQE I W 
D FVNQCHDPNEAAHAVTEQAI Q YGTEDNS TAVWPFGAWGKY KN 
SEINFS FSRS FASSGRWA 


6536 


242 


1174 


SLVKEMTNQYGILFKQEOAHDDArWSVAWGTNKKENSETWTGS 
LDDLVKVWKWRDERbDLQWSLEGHQLGWSVDISHTLPlAASSS 
LDAH1RLWDLENGKQIKS I DAGPVDAWTIAFSPDSQY LATGTHV 
GKWIFGVESGKKEYSIJ3TRGKFILSIAYSPDGKYLASGAIDGI 
INI FT)I ATGKLLHTLEGHAMP IRSLTFS PDSQLLVTASDDGYIK 
I YDVQHANIiAGTLSGHASW VLNVAFCPDDTHFVSS SS DKS VKVW 
DVGTRTCVHT FFDHQDQVWG VKYNGNGS KI VSVGDDQE IH I YDC 
PI 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPE VAKDKAS FRNYTSGPLLDRV 
m-YKLrWTHO/TVDFVRSKHAQFGGFSYKKMTVMFAVDLLDGLV 
DE SDPDVDFPNS FHAFQTAEG I RKAHPDKDW FHLVGLLHDLGKV 
LALFGE PQ WAWGDTFP VGCR PQAS WFCDS TFQDNP DLQDPR Y 

STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTIiSPQSTCTR 


6538 
6539 


3345 
218 


2412 
339 


P YL YDFIiDAb I TCQTAPE EAF I KIjDGLAGM LTEQLjRRIjTKQVQE 
ARHNRDDEAIKKAVNEYDETMEKYXPVLMAOAKIYWNLENYPMV 
E K I FRKS VE FCNDHD VWKLNVAHVLFMQEN KYKEAIG FYE P I VK 
KHYDWXLNVSAIVIANLCVSYIMTSQNEKAEEIiMRKIEKEEEQL 
SYDDPNRKMYHLCIVWLVIGTDYCAKGNYBFGISRVIKSLEPYN 
KKLGTDTW Y YAKRCFIjSI<LENMS XHMI VIHDS VI QECVQFLGHC 

ELYGTNI PAVI EQPLEEERMHVGICNTVTDESRQLKALI YE I IGW 
NK 

FLGAAS PH PH FS SIAPHPDQP EFT P VQDELEAMELWGPGV 1 


6540 


3 


391


LERLWLLLLRRPEDAHAECPTLGEAVTDHPDR1,WAWEKFVYLDE 
KQKAWLPLTIEIKDRLQL^VIiRREDWLGRPm'PTQIGPSLLP 
IMWQLYPDGRYRSSDSSFWRLVYHIKIDGVEDMI,L,ELLPDD 


6541 


1165 


536 


RTLVQRR I bM LLR KPARGRDLRGRGRGT PRGGRKGLLPTPDE F P 
RFEGGRKPDSWDGNREPGPGHEHFRDTPRPDHPPHDGHSPASRE 
RSSSLQGMDMASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542 


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHAIjYLAFIiARKEGT 
KRGFLiS KKTAEAS R WHEKWFAL YQNVJjF YFEGEQSCRPAGM YI»L 
EGCSCERTPAPPRAGAGOGGVRDAIiDKQYYFTVLFGHEGQKPLE 
IiRCEEEQDGKEWMEAIHQAS YADI LIEREVLMQKY IHLVQI VET 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDI KKI KKVQSFMRGWLCRRKWKTI VQDYICSPHAESMRKR 
NQ I VFTMVEAESEYVHQLY ILVNGFIiRPLRMAASSKKPP I SHOD 
VSS IFLNSETIMFLHE I FHQGLKARI ANWPTIjI ladlfdi llpm 
LNIYQEFVRWHQYSLQVLANCKQNRDFDKLIiKQYEANPACEGRM 

letfltypmfqi pryi itlhellahtphehverkslefaks KKE 
ei^rwhdevsdtenirknlaiermivegcdilldtsqtfirqg 
sliqvpsvergkxskvrlgslslkkegerqcflftkhflictrs 

SGGKLHLLKTGG VLSLI DCTLIEEPDASDDDS KGSGQVFGHIjDF 
KIWEPPDRAAFTWLLAPSRQEKAAWMSDI sqcvdnircnglm 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YASVERLLERLTDLRF1»S IDFLNTFLiHTYRIFTTAAWLGKLiSD 
XXKKt'l* 1 tt ±F VRSIjEIiFFATSQERUZGEHLw 

PPLAVSRTSS PVRARKLSLTSPI*NSKIGALDLTTSSS PTTTTQS 
PAASPPPHTGQIPLDLSRGbSSPEQSPGTVEENVDNPRVDLCNK 
LKRSIQKAVLESAPADRAGVESSPAADTTELSPCRSPSTPRHLR 
YRQPGGQTADNAHCS VS PASAFAI ATAAAGHGS PPGFNNTERTC 
DKEFI I RRTATNRVLNVLRHWVS KHAQDFELNNEI>KMNV1»NLLE 
EVLRDPDIXPQERKAAANILMALSQDDQDDIHIiKIiEDI IQMTDC 

ni\A£iV,f C*OJUOrtl v lC>J_irt-n»SJ x 1 ijLtJJrl V ±C rCoJ- trX CiCjC i~rc> *> n ft 1 IS. 

NERTP Y I MKTSQH FNDMSNliVASQ I MNYADVS SRANAI EKWVAV 
ADI CRCI1HNYNGVI1E ITS ALNRSA1 YRLKKTWAKVSKQTKALMD 
KLQKTVSSEGRFKNIiRETLKNCNPPAVP YLGMYLTDLAFI EEGT 
PNFTEEGL*VNFS KMRMISHI IREIRQFQQTSYRI DHQPKVAQYL 
LDKDL 1 1 DEDTLYELSLKI E PRLPA 


6543 


1857 


950 


F VS GCGRAG I GLS WAMAAEAR VSR W Y FGGLAS CGAACCTHPLDL 
LKVHLQTQQE VKLRMTGMALRWRTDG I LiALYSGLSAS LCRQMT 
YSLTRFAIYETVIUJRVAKGSQGPIjPFHEKVIjIiGSVSGIiAGGFVG 
TPADI,VNVRMQNDVKIiPQGQRRNYAHALDGLYRVAREEGI,RRI,F 
SGATMASSRGALVTVGQLSCYDQAKQLVLSTGYIiSDNIFTHFVA 
SFIAGGCATFLCQPLDVIiKTRLMNSKGEYQGVFHCAVETAKI^GP 
IAFYKGLVPAGIRLIPHTVLTFVFLEQLRKNFGI KVPS 


6544 


630 


79 


PSPCFI RSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRAS I S 
EPSDTDPEPRTLNPSPAGWFVQQHPELELMSSFRERFGRNWLQY 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQES PQ KMSEE VRAEPQEEE EEKEG KEEKEEGEMAPL PEAHL.G 
EGKQKECP 


6545 


176 


560 


PPHSHAAIiLPAAMTPLLTLI LWLMGLPLAQALDCHVCAYNGDN 
CFNPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVPRCFETVYDGY 
S KHAS TTS CCQ YDI*CNGTGI»ATPATIiALAP II*I»ATL WGLL 


6546 


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRP FYAVKCNS 
S PGVLKVliAQLGIiG FSCANKAEMELVQHI G I PASKI I CANPCKQ 
IAQIKYAAKHGIQLLSFDNEME1JUCVVKSHPSAKMVLCIATDDS 
HSLSCLSI>KFGVSLKSCRHLLENAKKHHVEWGVSFHIGSGCPD 
PQAYAQS I ADARLVFEMGTELGHKMHVLDLGGGFPGTEGAKVR F 
EEI ASVINSALDLYFPEGCGVDI FAEIiGRYYVTS AFTVAVSI I A 
KKEVLLDQPGREEENGSTS KTIVYHLDEGVYGI FNS VLFDNICP 
TPILQKKPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDW 
L VFDNMG AYTVGMGS P FWGTQACH I T YAMS RVAWEALRRQLMAA 
EQEDDVEGVCKPLSCGWEITDTLCVGPVFTPASIM 


6547 


1 


541 


LHS K YLAPALCSQPGMMRCCRRRCCCRQPPHALRPLLLLPLVLL 
PPLAAAAAGPNRCDTIYQGFAECLIRLGDSMGRGGELETrCRSW 
NDFHACAS QVLSGC P EE AAAW7ES LQQEARQAPRPNNLHTLCGA 

P VHVRERGTGS ETNQETLRATAPAIjPMAPAPPLIAAALAIiAYLjL 
RPLA 


6548 


2 


219 


F VS RL»S VRD VR FPT FI .GfSHfS anaMUTnfanv'c'fl a vt?nTctpr>kr<r>r» — 
I KG CGI TFTLGKG TE VGELK I hSR FQNA 


6549 


73 


1490 


etgrvcedarpacgsrsrrrrkeaapgiptpspssssptssrpa 
arafskaparlsrprareeppdpgrryiqeeiiqarkhklikmc 
s s vaaklw fiitdrr ire d ypqke i lralkakcceeeld fra wm 
dewtiti eqgnlglringel i taypqvwvrvptpwvqsdsdit 

VLRHLiEKMG cr lmn R PQA I LNCVNKFWTFQELAGHGVPLPDTFS 
YGGH2NFAKMIDEAEVLEFPMWKNTRGHRGKAVFLARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGGVGMMCSLSEOGKQLAIQVSNILGMDVCGIDIiL 
MKDDGSFCVCEANANVGFI AFDKACNLDVAG I IADYAASLLPSG 
i\u ± KATiiiiiji) Wb I /\&ET5EPELGPPASTAVDNMS ass ss vdsd 
PESTEREIiLTKIjPGGLFNMNQLLANEIKLLVD 


6550 


2293 


922 


FRVSREX^PDCXSIEQMGIiAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVSLIQFLIILGLVLFMVYGNVHVSTESNLQATERRAEGLYSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDRINASFR 
yLyijUKVi i IfsliNt ^jK IHAAi 1 L>£a EKOCRDOFKDMMKS PDAT .T . FMIf 

NQKVKTLEVEIAKEKTICTKDKESVLLNKRVAEEQLVECVKTRE 
LQHQERQIiAKEQLQKVQALCLPLDKDKFEMDLRNLWRDSIIPRS 
LDNI^Yl^YHPLGSELASIRRACDHMPSI^SKV^EI^SLRAD 
IERVARENSDLQRQKLEAQQGLRAS QEAKQKVEKEAQAREAKLQ 
AECSRQTQIjAI»EEKAVtiRKERnNT.aiCFT PPinfPP&PnT dmct at 
RNS ALDTC I KTKSQPMMPVS RPMGP VPNPQP I DPAS LEEFKRKI 
LESQRPPAGI PVAPSSG 


6551 


157 


748 


IQP PDPRNMTLAAYKE KMKELPLVS LiFCSCFLADPLNKS S YKYE 
ADTVDLNWCVI S DMEV I ELNKCTSGQS FE VI I>KP P S FDG VPE FN 
ASLPRRRDPSLEEIQKKLEAAEERRKYOEAELLKHIxAEKREHER 
EVI QKAI EENNN F I KMAKE KLAQKMESNKENREAHIiAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


IQP PD PRNMTLAA Y KEKMKELPLVS LFCS CFLAD PLNKS S YKYE 
AIMVDIJJWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
EVIQKAIEENNNFI KMAKEKIAQKMESNKENREAHLAAMLERJ^Q 
EKDKHAEEVRKNKELKEEASR 


6553 


2 


1807 


FVWS KMAAHLS YGRVNLNVLREAVRRELREFLDKCAGSKAI WD " 
EYLTGPFGLIAQYSLLKEHEVEKMFTLKGNRLPAADVKNI I FFV 
RPRLELMO I IAENVLS EDRRG PTRDFH I LFVPRRS LLCEQRLKD 
I^VLGSFIHREEYSLDLIPFDGDLLSMESEGAFKECYLEGDQTS 
L YHAAKGLMTLQAL YGTI PQ I FGKGE CARQ VANMM I RMKRE FTG 
SONS I FPVFDNLLLLDRNVDLLTPLATQLTYEGLI DEI YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEELYAEIRDKN 
FNA VGS \TL S KKAK 1 1 SAAFEERHNAKTVGE I KQFVSQLPHMQAA 
RGSLANHTSIAELIKDVTTSEDFFDKLTVEQEFMSGIDTDKVNN 
Y I EDC I AQKHSL I KVLRLVCLQS VC^SGIiKQKVIjD Y YKRE I LQT 
YGYEHILTLHNLEKAGLLKPQTGGRNNYPTIRKTLRLWMDDVNE 
QNPTOISYWSGYAPLSVRLAQlil.SRPGWRSIEEVI.RILPGPHF 
EERQPIjPTGL^KKRQPGENRVTLIFFLGGVTFAEIAAIjRFIjSQI. 
EDGGTE YV IATTKLMNGTS W I EAliMEKPF 


6554 


119 


1244 


FEMGSQVSVESGALHWIVGGGFGGIAAASQLQALNVPFMLVDM 
KD S FHHNVAALRAS VETG FAKKT FI S YS VTFKDNFRQGI* WG I D 
LKNQMVIiLQGGEALPFSHLIIATGSTGPFPGKFNEVSSQQAAIQ 
AYEDMVRQVQRSRF I VVVGGGSAGVEMAAE I KTE YPEKEVTLIH 
S QVALADKELLPS VRQE VKE I LLRKG VQLLLS ERVSNLBELPLN 
EYREYI KVQTDKGTE VATKIjVIIiCTG I KI NS S AYRKAFES RLAS 
SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYIjAGLHANIAV 

anivnsvkqrplqaykpgaltfllsmgrndgvgqisgfyvgrlm 
vrltksrdlfvstswktmrqspp 


6555 


1552 


498 


IHMALLRKIKQVLLFLLIVTLCVILYKKVHKGTVPKNDADDESE 
TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQPLNFVRFYLPLLIHQHEKVIYLDDDVIVQGDIQELYDTTIA 
LGHAAAFSDDCDLPSAQD INRLVGLQNTYMGYLD YRKKAI KDLG 
IS PSTCS FNPGVI VANMTEWKHQRITKQLEKWMQKNVEENLYSS 
SLGGGVATS PML I VFHGKYSTIN PI*WH I RHLGWN PDAR YS EHFL 
QEAKLiliHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLWHHS 


6556 


241 


1449 


ASIiCKGCFFVTHVLVI I LPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNSQGQIJRVPVVFVTNAGNILQHSKAQEIjSALLG 
CEVDADQVILSHSPMKLFSEYMEKRMbVSGQGPVMENAQGLGFR 
mAm^El^MAFPLLDMVDLERRLKTTPIiPRNDFPRIEGVLbliG 
EPVRWETSLQL I MD VLLSNGS PGAGLATPP YPHI»PVLASNMDLI> 
WMAEAKMPRFGHGTFIJUCLETIYQKVTGKELRYEGLMGKPSILT 
YQYAEDLI RRQA ERRGWAAP I R KL YAVGDNPMS DVYGANL FHQ Y 
LQKATHDGAPELGAGGTRQQQPSASQSCISILVCTGVYNPRNPQ 
STE PVLGGGEP P FHGHRDLCFS PGLMEASHWNDVNEAVQLVFR 
KEGWALE 


6557 


2590 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN 
KSPQSNSPVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 
SKLQFNTTNCRSDTVMEKRSFKVPLGKGRRCVVIiADGFYEWQRC 
QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT 
MAGIFDCWEPPFX3GDVLYSYTIITVDSCKGLSDIHHRMPAILDG 
EEAVS KWLDFGEVS TQEALKL I HPTENI TFHAVS S WNNSRNNT 
PECLAPVDLWKKELRASGSSQRMLQWIATKSPKKEDSKTPQKE 
ESDVPQWSSQFLQKSPLPTKRGTAGLLBQWLKREKEEEPVAKRP 
YSQ 


6558 


21 


1138 


FHGRRRGGRKMELGSC^EGGREAAEEEGEPEVKKRRLLCVEFAS ' 
VASCDAAVAQCFIiAENDWEMERALNS YFE PPVEESALERRPET I 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENPSMFSLITWNID 
GLDLNNLSERARGVCSYIALYSPDVIFLQEVIPPYYSYLKKRSS 
WYE 1 1 TGHE EG Y FTA IMLKKSRVKLKS QE 1 1 PFPSTKMMRNLLC 
VHVNVSGNELCLMTSHLESTRGHAAERMNQLXMVLKKMQEAPES 
ATVIFAGDTNLRDREVTRCGGLPNNIVDVWEFLGKPKHCQYTWD 
TQMNSNLG I TAACKLRFDR I FFRAAAE EGH 1 1 PRSLDLLGLEKb 
DCGRFPS DHWGT .T .CNIiDI I L 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPIAMDCCASRSCSVPTGPATTICSS 
DKSCRCGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPCVPTCF 
LLMSCQPTPGLETLNIjTTFTQPCCEPCIiPRGC 


6560 


3 


1435 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GVDHTKMSLHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
IRNVTS PTRQHHVEREKDHSS SR P S S PRPQKAS PNGS XS SAGNS 
SRNSSQSSSDGSCKTAGEMVFVYENAKEGARNIRTSBRVTLIVD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGEYEVAE 
GIGSTVFRAI LDYYKTG I IRCPDG I S I PELREACDYLCISFEYS 
TIKCRDLSAI^ELSNDGARRQFEFYIJSEMILPIjMVASAQSGER 
ECHrWLTDDDWDWDEEYPPQMGEEYSQlIYSTKLYRFFKYIE 
NRDVAKSVLKERGLKKIRLGIEGYPTYKEKVKKRPGGRPBVIYN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKSKSITNLAAAAABIPQD 
QLWMHPTPQVDELDI LPIHP PSGNSDLDPDAQNPML 


6561 
6562 


3 


1066 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLSPEPSPSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGPGNLL 
AKQLVDRGMQVLAACFTEEGSQKLQRDTSYRLQTTLLDVTKSES 
IKAAAQVTODKVGEQGLWALVNNAGVGIiPSGPNEWLTKDDFVKV 
IWVNLVGLI 2 VTLHM IjPMVKRARGRVVNMS SSGGRVAVI GGG^C 
VSKFGVEAFSDS I RRELYYFGVKVCI I E PGN YRTAI LGKENLES 
RMRKLWERLPQETRDS YGEDYFR I YTDKLKNIMQVAF. PRVRDVI 

NSMEHAIVSRSPRIRYNPGLDAKLLYIPLAKLPTPVTDFILSRY 
LPRPADSV 


6563 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
1»YEWFLGKRS EGVPVSG PMLI EKAKDFYEQMQL»TEPCVFSGGWIi 
WR FKARHG I KKLDAS S E KQSADHQAAEQ FCAFFR S LAAEHGLS A 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKOGKDRLTVLMCANA 
TGSHRLKPLAXGKCSGPRAFKGlQHIiPVAYKAQGNAWVDKEIFS 
DWFHHIFVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAI FS VACAWNAVPSHVFRRAWRKI*WPS VAFAEG S SSEE E 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGRPPAATSPAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAAVAFDAVLRFAERQPCFSAQEVGQLRALRAVFRSQQQVRRRR 
G AI*G A WKVEALQEG PGGCGATAQS PLP CS S TAGDN 


6564 


1319 


2694 


LAR PAQP VLLRE PEG AGPP VPAGHL VHHLQGGHLRERAH P DL.E A ' 
HEHPLPCDOMFWRQMGGHLRMVEANSRGWWGIGYDHTAWYTG 
GYGGGCFQGLASSTSN1YTQSDVKCVHIYENQRWNPVTGYTSRG 
LPTDRYMWSDASGLQECTKAGTKPPSLQWAV7VSDWFVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 
EVP P I ALRDVS 1 1 PES PGAEGSGHS I ALWAVSDKGDVLCRLGVS 
ELN PAGSS WLHVGTDQP FAS I S I GACYQVWAVARDGSAF YRGSV 
YPSQPAGD CWYH I PSP PRQRLKQVSAGQTS VYAIiDENGNLW YRQ 
GITPSYPG^SSWEHVSNWVCRVSVGPLDQVWVIANKVQGSHSLS 
RG TVCHRTGVQPHEPKGHGWDYG I GGGWDH I S VRANATRAPRSS 
SQEQEPSA P PEAHG PVCC 


6565 


1 


975 


APGSC^GWSYCGRGWSRAMRGC^I^LRJSSWPGDLLSARiLSQE 
KRAAETHFGFETVSEEEKGGKVYQVFESVAKKYDVMNDMMSLGI 
HRVWKDLIJIiWKMHPIiPGTQLIjDVAGGTGDIAFRFIjNYVQSQHQR 
KQ KRQLRAQQNLS WEE IAKE YQNEEDSIX3GS RVVVCDINKEMtJC 
VGKQKALAQGYRAGLAWVLGDAEELPFDDDKFDI YTIAFG I RNV 
THIDQALQEAHRVLKPGGRFLCLEFSQVNNPLISRLYDLYSFQV 
I P VIiG E VIAGDWKS YQ YLVES IRRFPSQEEFKDMI EDAGFHKVT 
YESLTSGXVAIHSGFKL 


6566 


1464 


999 


RSAVANGLTKRRMGLKLNGRYI SL I IAVQIAYLVQAVRAAG KCD 
AVFKGFSDCLLKLGDSMANYPQGLDDKTNIKTVCTYWEDFHSCT 
VTALTpCQEGAKDMWDKLRKES KNLNIQGSLFELCGSGNGAAGS 
LLPAFPVIiLVSLSAAIATWLSF 




3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFi*FPGAV7AQG 
HVPPGCSQGI^PLYYNLCDRSGAWG I VLEAVAGAG I VTTFVLT I 
I LVASLP FVQDTKKRS LLGTQVFFLIfGTLGLFCIiVFACVE KPDF 
STCASRRFLFGVLFAICFSCLAAHVFALNFLARKNHGPRGWI F 
TVALLLTIiVEVl INTEWL 1 1 TLVRGS GEGGPQGNS S AGWAVAS P 
CAIANMDFVMm J lYVMI J ^J J I i GAFIJ2AWPJ^CGRYKRV!RKHGV^ 
LLTT ATS VAI WWWI VM YTYGNKQHNS PTWDDPTLAI ALAANAW 
AFVLFYVXPEVSQVTKSSPEQSYQGDMYPTRGVGYETILKEQKG 
3SMFVENKAFSMDEPVAAKRPVSPYSGYNGQLLTSVYQPTEMAL 
MHKVPSEGAYDIILPRATANSQVMGSANSTLRAEDMYSAQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


TKRSNLKAYACS IHH IRTMS YVFVWDSSQTNVPLLQACI DGDFN 
YS KRLLESG FDPN I RDS RGRTGI»HIiAAARGNVDI OQLLHKFX3AD 
LLATD YQGNTALHLCGH VDTI QFL VSNGLK I D I CNHQGAT PLVL 
AKRRG VNKDVT RLLESLE EQEVKG FNRGTHS KLETMQTAE SESA 
MESHS LLN PNLQQGEGVLSS FRTTWQEFVEDLGFWRVLLL I FVI 
ALLSLGIAYYVSGVLPFVENQPELVH 


6568 


3 


1133 


HASDRLbVLPDNYSHFSQASANLQGPSRTTELFHPTLAS ISSPM 
LEGAELYFNVDHGYIjEGLVRGCKASIjLTQQDYINLVQCETLEDI* 
KIHLQTTDYGNFIANHTNPLTVSKIDTEMRKRLOSEFEYFRNHS 
LEPLSTFLTYMTCS YMI DHVI LLMNGALQKKSVKEILGKCHPLG 
RFTEMEAWIAETPSDLFNAILIETPI^FFQDCMSENALDELN 
IELLRNKLYKSYI,EAFYKFCKNHGDVTABVMCPILEFEADRRAF 
IITLNSFGTELSKEDRETLYPTFGKIjYPEGLRLLAOAEDFDQMK 
NVADH YG VYKPLFEAVGGS GGKTLE DVF YERE VQMNVLA FNRQF 
HYGVFYAYVKLKEQEIRNIWIAECISQRHRTKIN3YIPIL 


6569 


205 


1532 


RRRGPQRLGHGRPT PLLCRWRTAG P SHWEKQARAFQGLR P VDPR 
RWSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSSI 

aeiqkdveyrlpftinnltininillppqfpqekpvisvyppir 
hhlmdkqgvy vts plvnnftmhsdlgki iqslldefwknppvla 
ptstafpyiiysnpsgmspyasqgfpflppyppqeanrsitslsv 
ai7tvss sttshttakpaaps fg vlsnlplpi ptvdas i pts qng 
fgykmpdvpdafpel^selsvsqltdmneoeevlleqfltiipqlk 
qi itdkddbvks ieeiarknli^psleakrqtvldkyelltqm 
kstfekkmqrqhelsescsasalqari1kvaaheaeeesdniaed 
flegkmeiddflssfmekrti chcrrakeeklqqaiamhsqfha 

PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRXMGSKALPAPI PLHPSLQLTNYS FLQAVNTFPATVDHLQG 
LYGLSAVQTMHMNJIWTIjGYPNVHEITRSTITEMAAAO^iVDAR? 
PFPALPFTTHLFHPKQGAIAHVLPALH KDRPRFDFANIAVAATQ 
EDPPKMGDLSKLS PGLGSPI SGLSKLTPDRKPSRGRLPSKTKKE 
F I CKFCGRHFTKS YNLLIHERTHTDERP YTCD I CHKAFRRQDHL 
RDHRYIHSKEKPFXCQECGKGFCQSRTIAVHKTLHMQTSSPTAA 
SS AAKCSGETVI CGGT 


6571 


169 


656 


APDMKRKKLQKLTDTLTKNCKHLFRG FDKDNDGCVNVLEWI HGL 
S LFLRGSLEEKMKYCFE VFDLNGDGF I SKEEMFHMLKNSLLKQP 
SEEDPDEG I KDLVE I TLKKMDHDHDGKLS FADYELAVREETLLL 
EAFGPCLPDPKSQMEFEAQVFKDPNEFNDM 


6572 


49 


1646 


TP ERAQPGALLGAAG CC VCGGRW WPRSHERG YFSS AKMGSKRRN 
LSCSERHQ KLVDBNYCKKLHVQALKNVNSQ I RNQMVQN3NDNR V 
QRKQFLRLI^NEOFEIiDMEEAIQKAEENKRLKELQLKQEEKLAM 

QXAEKDAIKYEQMKRDAEIAKTMMEEHKRII KEENAAEDKRNKA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED 
QLE KOXJK LEKMNAMRR Y IEE FQKBQ ALWRIOCKREEMEEENRKII 
EFAKMQQQREEDRMAKVQENBEKRLQLQNALTQKLEEMLRQRED 
LEQVRQELYQEEQAE I YKSKLKEEAEKKLRKQKEMKQDFEEQMA 
LKELVLQAAKEEEENFRKTMLAKFAEDDR I ELMNAQ KQRMKQLE 
HRRAVEKLIEERRQQFLADKQRELEEWQLQQRRQGFINAIIEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGGGESQSFRAQDGTRTPATIX:LMYIiGX3PRXLMTQGGYDMVQK 
LFLDFFRRRLSQRPTAEELEQRNI LKPRNEQEEQEEKRE I KRRL 
TRKLSQRPTVEELRERKI LIRFSDYVEVADAQD YDRRADKP WTR 


6574 


204 


1159 



6575 



117 



820 



6576 



1060 



6577 



2271 



987 



6578 



377 



1489 



6579 



711 



6580 



1571 



Tl TAADKVSRGECWRVGGRTVCWVSIjGSPI.G SV ~ 

LESS v y vs VFWACGVS WTGAAGIiQD GALSDTMARNAEKAMTA 
LARFROAQLEEGKVKERRPFIASECTELPKAEKWRRQIIGEISK 
KVAQI QNAGLGE FR I RDLNDE INKLLRE KGHWEVRI KELGGPD Y 

GKVGPKWLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 

PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 

.KWKAEREARLARGEKEEEEEEEEEINIYAVTEEESDEEGSQEKG 

GDDSQQKFIAHVPVPSQQEIEEALVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 

SPAbASQSGGITEEKMLEPQENGVIDLPDYEHVEDETFPPFPpp- 
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI 
SERGLPALRHVFDKAKFKGKGHEAEDLKMLIRHMEHWAHRLFPK 
LQ FEDFI DRVE YLGS KKE VQTCLKR I RLDL P I LHEDF VSNNDE V 

AENNEHDVTSTELDPF1>TNLSESEMFASELSISLTEEQQQRIER 
NKQLALERRQAKLP 

PEPQALVGQKKGALRLLV ARLVLTVSAPAKVkRKVLRPVLSWMD 
RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR 
G FLLPNLPCVFSS AFTQGWGS RRRWVTPAGRPDFDHLLRT YGD V 
VVPVANCGVQEYNSNPKEHMTIiRDYITYWKEYIQAGYSSPRGCIj 
YLKDWHLCRD FP VEDVFTIjP VYFS S DWLNE FWDALDVDD YRFVY 

agpagswspfhadifrsfswsvnvcgrkkwllfppgoeealrdr 
hgnlpydvtspalcdthlhprnqlagppleitqeagemvfvpsg 

WHHQVHNLVMCCFSCPLSGAFU3EDGSTTSPLSQPELGWNGVAH 
G 



aURr^DDyuiVXEAMLEAPYKKEEDEQQRKEVK KDYPSNTTSS 
TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
REKS P VRE P VDNLSP EERDART VFCMQIiAARI RPRDLEDFFSAV 
GKVRD VR 1 1 SDRNSRRS KG IAYVE FCEI QS VPLAIGI/TGQRIjLG 
VPI I VQASOAEKNRXiAAMANNLQKGWGGPMRLYVGSLHFWITED 
MLRG I FEPFGK I DNI VLMXDS DTGRS KG YG F I TFSDS ECARRAI* 
EQLNGFEIiAGR PMRVGHVTERLBGGTD I TFPDGDQELDLGS AGG 
RFQLMAKLAEGAG I QLPSTAAAAAAAAAAQAAALQLNGAVPL.GA 
LNPAALTALSPALNLASQCLQLS SLFTPQTM 
PSSSATMNRAPIjKRATILHMAI/IGASDPSAEAEANGEKPFIjLiRA 

lqialvvslywvtsismvflnkylldspslrldtpi fvtfyqcl 
vttllckglsalaaccpgavdfpslrldlrvarsvlplswfig 

MITFNNLCLKYVGVAFYIWGRSLTTVT^I^YLLLKG^rTSFYA 
LLTCGI I IGGF>7LGVDQEGAEGTLSWIiGTVFGVI*ASLCVSLNAI 

YTTKVLPAVDGS I wrltfynnvnaciiiFlpllLllgelqalrdf 
aqlgsahfwgmmtlgglfgfaigyvtglqikfts plthnvsgta 

KACAQTVIAVLYYEETKSFLWWTSNMMVLGGSSAYtWVRGWEMK 
KTPEEPSPKDSEKS AMGV 

RPPRVWYPEXjRELSAAAPRVJSHRTAPG IMVFYFTSSSVNSSAY T 
IYMGKDKYENEDLI KHGWPEDI WFHVDKLSSAHVYLRLHKGENI 
EDIPKEV^DCTAHLVKANSIQ^CKMNNVNW 

DVGQIGFHRQKDVKIVWEIOCVNEILNRLEKTKVERFPDLAAEK 
ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRS YSSLMKV 
ENMSSNQDGNDSDEFM 



IjVAXiKNWKPKGTNI PAPQS PVFGEAV SGVYMMTKVLGMAPVIX3 R" 

RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG 

PREALSQLRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQELQ 

AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 

KISSSGTAKESPSSMQPQPI,ETSHKYESWGPLYIQESGEEQEFA 

QDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDIISVIIANKPE 

ASLERQCVNIfENEKGTKPPIjQEAGSKKGRESVPTKPTPGERRYT 






WO 01/53312 
CAECGKAFSNSSNLTKHRRTHTGEKPYVCrKCXSKAFSHSSNliTli 
H YRTHLVDRP YDCKCG KAFGQS SDLLKHQRMHTEEAPYQCKDCG 
KAFSGKGS L IRH YR I HTGEKP YQCNECGKS FSQHAGliS SUQRLH 
TGEKP Y KCKECGKAPNHS SNFNKHHR I HTGEKP Y WCHHCGKT FC 


6581 


228 


476 


RVFIiKDLSSTPMASKNTAS I AQARKIiVEQLKMEAN IDRI KVSKA 
AADLMAYCEAHAKEDPLDTPVPAS ENPFREKKFFCAI I» 


6582 


1428 


718 


CFTTKTHCSPVSVPYbSPLVLRKELESLLENEGDQVIHTSSFIN 
QHPIIFWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLS 
QDSKLVY IQLLWDN INLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVETI RQS 1 QHNNVLKP INLLSQQMKPGMKRQRSLYRE 
ILFLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI 


6583 


487 


41 


R I FS MTSGRLRWRCTWR PATALWS ASLRLGTS SMH PS PRS I SL P 
LSMMLSPLPSNTRGLS PTALFRS PDSEHATSCPRLHLWRCRAPI, 
RS P S PLGRIiQVLPRS PLHVHTHNS GKEVLGLQVQRSRSGTGPAC 
SQAGSGAVQGGNWCI F 


6584 


189 


1750 


PLPMAALGPS SQNVTE Y WRVPKNTTKKYNIMAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG 
I VLKEFRPEDQPWLIiRVNGKSGRKFKGI KKGGVTENTS YYI FTQ 
CPDGAFEAFPVHNWYNFTPLARHRTbTAEEAEEEWERRNKVLNH 
FS I MQQRRLKDQDQDE DEE EKE KRGRRKAS ELR I HDLEDDLEMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFEDS 
DDGD FEGQ E VDYMSDGS SS SQEEPE S KAKAPQQEEGPKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAE 
GGSTSSTLRAAASKLEQGKRVSEMPAAKRLRbDTGPQSLSGKST 
PQPPSGKTTPNSGDVQVTEDAVRRYIjTRKPMTTKDLI/KKFQTKK 
TGLSSEQTVNVIiAQILKRLNPERKMINDKKHFSLKE 


6585 


3 


1678 


GPIRWSRIDDFVGGDPRAEASCSVl^HSKPHAMADSRDPASDQMQ 
HWKEQRAAQ KADVI »TTG AGN P VGD KLNVT TVG PRG PLLVQDWF 
TDEMAHFDRERI PERWHAKGAGAFGYFEVTHDITKYSKAKVFE 
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTP I FFI RDPXLFPSFIHSQKRNPQTHLKDPDMVWDFWSLR 
PES LHQVS FT*FSDRGI PDGHRHMNG YGSHTFIGLVNANGEAVYCK 
FHYKTDOXSIKNLSVEDAARIjSQEDPDYGIRDLFNAIATGKYPSW 
TFY IQVMTFKQAETFP FNPFDIiTKVW PHKDYPIi I PVGKL VLNRN 
P VNYFAEVEQIAFDPSNMPPG I HASPDKMLQGRLFAYPDTHRHR 
LGPN YLH IPVNCPYRARVANYQRDGPMCMQDNQGGAPNY YPNS F 
GAPEQQPSALBHS IQYSGEVHRFNTANDDNVTQVRAFYVNVLNE 
EQRKR1»CENIAGHLKDAQI F I QKKAVKNFTEVH PDYGSHIQALL 
DKYNAEKPKNAIHTFVQSGSHLAAREKANI* 


6586 


32 


804 


PLPEQPAESTSTMPVSGTPAPNKKRKSSKIiIMELTGGGQESSGL 
NLGKKI SVPRDVMIjEEIiSLLTNRGS KMFKLRQMRVEKFI YENH P 
DVFSDS SMDHFQKFIiPTVGGQt.GTAGQG FS YS ksngrggsqagg 
SGSAGQYGSDQQHHU5SGSGAGGTGGPAGQAGRGGAAGTAGVGE 
TGSGDQAGGEGKH I TVFKTY I S PWERAMGVDPQQKMELG IDLLA 
YGAKAELP KYKS FNRTAMP YGG YEKASKRMTFQMP KV 


6587 


75 


1117 


RRVPS LGKM PECWDGEHD I ET P YGLLHWIRGS PKGNR PAI LTY 
HDVGLNHKL.CFNTFFNFEDMQEITKHFWCHVDAPGQQVGASQF 
POG YQ FPS MEQLAAMIiPSWQHFGFKY VI G IG VGAGAYVLAKFA 
L I FPDLVEG LVLVNI DPNGKGW IDWAATKLSGLTS TX PDTVLSH 
LFSQEELVNNTELVQSYRQQIGNVVNQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRCPVMLWGDNAPAEDGVVECNSKLDPTTTT 
FLKMADSGGLPQVTQPGKLTEAFKYFLOGMGYMPSASMTRIARS 
RTASLTSASSVDGSRPQACTHSESSEGLGQVNHTMEVSC 


6588 


137 


501 


LGLQAQLliEXjRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AQKALS KS KKAQEVEVLIiSENEMLQAKLHSQBEDFRLQNSTLMA 
EFSKLCSQMEQLEQENQQLKEGAAGAGVAQAGP 


6589 


2 


1405 


R P WGS AMAT KS RQE FFQQLLQGCLLPTAQOGIjDQ I W LLLA I CI±A 
CRI^WRLGLPSYLKHASTVAGGFFSLYHPFQLHMVVJVVLLSLLC 
YLVLFLCRHSSHRGVFLSVTILIYLIJ^GEMHMVDTVTWHKMRGA 
QMIVAMKAVSLGFDIiDRGEVGTVPSPVEFMGYLYFVGTIVFGPW 
I S FHS YLQAV OGRPLS CRWLQ KVARS LALALLCLVLS TCVGP YL 
FPYFIPLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFV 
GFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW 
TSWNLPMSYWLNNYVFKKALRLGTFSAVLVTYAASALLHGFSFH 
LAAVLLS LAF ITYVEHVLRKRLARILSAC VLS KRCPPDCSHQHR 
LGLGVRALNLLFGA1AI FHLAYLGSLFDVDVDDTTEEQGYGMAY 
TVHKWSELSWASHWVTFGCWI FYRLIG 


6590 


2177 


656 


VRAY3HVI>S LLENVFTPMFCHRDE YFRQLLRGAES PTRNSKLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFI EEGI WMEDDS P VEAVS TPNT PRNLAAWKI S I P Y 
VDFFEDPSS ER JCEKKER I P VFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKR1IGPKNYEFLKSKREE 
FQEYLQKLLQHPEiSNSQLLADFLS PNGGETQFU3KILPDVNIX5 
KI I KS VPG KLMKEKGQHLEPFI MNFINS CES PKPKPSRPELTI I» 
SPTSENNKKIjFNDLFKHNANRAENTEKKQNQNYFMEVhlTVEGr\^ 
DYLMYVGRVVFQVPDWLHHLLMGTRILFKNTLEMYTDYYLOCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGIiQQPVLNKQLTYVLLDI 
V IQELFPELNKVQKEVTS VTSWM 


6591 
6592 


2177 


656 


VRAYEHVI*SIJjENVFTPMFCZHRDEYFRQLI*RGAESPTRNSiGLiNR 
GS I»S LDDFRNTQKRGES FGISRIG S KIKGVFKS TTMEGAMXiPNY 
GVAEGEDDFI EEG IWMEDDS PVEAVS TPNTPRNLAAWKI S I P Y 
VDFFEDPSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQE YI^KIjLQHPEIjSNSQIjIjAD FLS PNGGETQFIiDKI LPDVNIX3 
KI I KS VPGKLMKEKGQHIiEPFIMNF INSCES PKPKPSRPELTIL 
SPTSENNKKLFNDI^K^^aAl^RABNTERKQNQNYFMEV^f^VEGVy 
DYLM YVGR WF QVPDWLffilljl^GTR ILFKNTLEMYTD YYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MN YI PDLLVKCIGEETKYESIRLLFDGLQQP VLNKQLTYVLLDI 
VIQBLFPETiNKVQKEVTSVTSWM 


6593 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEORLKAIVAEKFAIATKEG 
DLPQVERFFKIFPLIjGLHEEGIiRKFSEYLCKQVASKAEENLLMV 
LGTDMSDRRAAVI FADTLTLLFEG IARIVETHQP IVETYYGPGR 
LYTLIKYLQVECDRQVEKWDKFIKQRDYHQQFRHVQNNLMRNS 
TTEKI E PRELDP I LTEVTLMNARS BLYLRFLKKR I SS DFE VGDS 
^IASEEVKQEF^QKCLDKLIlNNCIJLSCTMQELIGLY/VTMEEYFMRE 
TVNKAVAIiDTYE KGQLTS SM VDDVFY I VKKC IGRAIiS SS S IDCIi 
CAM INLATTELES DFRDVI*CNKLRMGF PATT FQD IQRGVTSAVN 
IMHSSLQQGKFDTKGIESTDEAKMS FIiVTLNNVEVCSENISTLK 
KTLES DCTKLFS QG I GGEQAQ AKFDS CLSDLAAVSNKFRDLLQE 
GLTELNS TAI KPQVQPWI NS FFSVSHNI EEEE FNDYEANDP WVQ 
OFILNLEQQMAE FKASLS PVI YDSLTGLMTSLVAVELEKWLKS 
TFNRLGGLQFDKELRSLI AYLTTVTTWTIRDKFARLSQWATI LN 

LERVTEILDYWGPNSGPLTWRuTPAEVRQVLAIiRIDFRSEDIKR 
LRL 




3 


1837 


EAFSAGSRRRGLAU3RGVLGGLGGYCPCCCRRRGRLLVLLLLVR 
RGGEGGGGRGRGDKRRRRQARRQRRRPE PAEARGGKMADVLS VL 
RQYNIQKKE I WKGDEVI FGE FSWPKNVKTNYWWGTGKEGQPR 
EYYTLDSILFLLNN^/HLSHPVYVRRAATEWIPWRRPDRKDLLG 
YLNGEASTS AS I DRSAPLE I GLQRSTQVKRAADEVIiAEAKKPR I 
EDEECVRLDKERIJVARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
I KAKI MAKKRSTI KTDLDDD I TALKQRS FVDAEVDVTRDIVSRE 
RVWRTRTT I LQSTGKN FSKNI FAI LQS VKAREEGRAPEQRPAPN 
AAP VDPTIiRTKQP I PAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKSVTEGASARKTQTPAAQPVPRPVSQARPPPNQKKGSRTP 
III IPAATTSLITMLNAKDLLQDLKFVPSDEKKKQGCQRENETL 
I QRRKDQMQPGGTAI S VTVP YRWDQPLKLMPQDKDRWAVFVQ 
GPAWQFKG WPWL.LPDGS PVDI FAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFIiRVWETLDRYMVKHKSHLRF 


6594 


1 


1096 


EFPGRRFRGSQAS PLCATCX? PALLRAPTRAAMTRSLFKGNFWS A 
DILSTIGYDNriQHLNNGRKNCKEFEDFLKERAAIEERYGKDLL 
NLSRKKPCGQSEINTLKRALEVFKQQVDNVAQCHIQIAQSLREE 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
QKCRDKDEAEQAVS RS ANLVNPKQQEKLF VKLATS KTAVEDSDK 
AYMLHIGTLD KVREEWQS EHI KACEAFEAQECER IN F FRNALWIi 
H VWQLSQQCVTS DEM YEQVRKSLEMCS IQRDI EYFVNQRKTGQI 
PPAP IMYENFYSSQKNAVPAGKATGPNLARRGPLPI PKS S PDDP 
NYSLVDDYSLI.YQ 


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWIjYLH 
RYNAYPSEQEKI^LSGQTNI^VLQICNWFINARRRLLPDMLJRKD 
GKDPNQFTISRRGGKASDVALPRGSSPSVIAVSVPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGEIiESPKPLVTPGSTLTIiIiTRAEA 
GSPTGGLFNTPP PTP PEQDKEDFS SFQLLVE VALQRAAEMELQK 
QQDPS LPLLHTP I PLVSENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKI FC IRI SDDIDDPKWTLCLQVMLPNEYPGTAP 
P I YQLMAPWI»KGQERADLSNSL»EE I YIQNIGES I L»YI»WVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDHIACQPESSVKAbD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLS KLYENKKIASATHNIYAYR I YCEDKQT FLQDCEDDGETA 
AGGRLLHLMEILNVKN\mVWSRWYGGII^PDRFKHINNCARN 
I LVEKNYTNS PEESSKALGKNKKVRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTUXQVMLPNEYPGTAP 
P I YQLNAPWLKGQE RADLS NSLEE I Y I QNIGES I LYLWVEK I RD 
VL I QKSQMTE PG PDVKKKTEEED VECEDDIi ILACQPESS VKALD 
FD I S ETRTEVEVEEIiPP IDHGI P ITDRRS TFQAHLAP VVCPKQV 
KMVLS KLYENKK I ASATHN I YAYR I YCEDKQTFLQDCEDDGETA 
AGGRLLHLME ILNVKNVMWVSRWYGGI LLGPDRFKHINNCARN 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6598 


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLS FCRLH KQSS MT VMEAQES PLFNNVKLQRKLP VES I QI VLEE 
LR KKGNLE WLDKS KS SFL I MWRRPEE WGKL I YQWVS RSGQNNS V 
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAE I ITVS 
DGPRRQVL,tAGTCIiPIiLLTSHLSRAFKRRQTQCPPKTGSVTPPD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKYIDENQDRYIKKLAKWVAIQSVSAWPEKRGBIRR 
MMEVAAAD VKQLGGSVELVD IGKQKLPDGSEI PLPPI LLGRLGS 
DPQKKTVC I YGHLDVQPAALEDGWDSEP FTLVERDGKLHGRGST 
DDKGPVAG WINAI»EAYQKTGQE I PVNVRFCLEGMEESGSEGIiDE 
LI FARKDT FFECD VD YVCISDN Y WLGKKKP CITYGLRG I C YFF I E 
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGNILIPGIN 
EAVAAVTEEEHKLYDDIDFDIEEFAKDVGAQILLHSHKKDILMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 
WGEQVTSYLTKKFAELRSPNEFKVYMGHGGKPWVSDFSHPHYL 
AGRRAMKTVFGVEPDLTREGGS I PVTLTFQEATGKNVMLLPVGS 
ADDGAHS QNEKLNRYNY I EGTKMLAAYLYBVSQLKD 


6600 


2 


934 


PGRLFRVAAMESAGLEQLLRELLLPDTERIRRATEQLQI VLRAP " 
AAI»SALCDLI*ASAADPQ IRQFAAVLTRRRLNTRWRRloAAEQRES 
LKSLILTALQRETEHCVSIiSIiAQLSATIFRKEGLEAWPQLIjQLI* 
QHSTHSPHSPEREMGLLLLSWVTSRPEAFQPHHRELUILIiNET 
I^EVGSPGbLFYSIJ^TLTTMAPYLSTEDVPIARMLVPKIiIMAMQ 
TLI P IDEAKACEALEALDELLESE VPVITP YLSEVLTFCLEVAR 
NVALGNAIR IR I LCCLTFLVKVKS KALLKNRLLATLAAHPFPHC 
GC 


6601 


529 


1420 


PRAAARAP P PAVLRK DRRAATAPG AGEMTLHG PLAQR YFLNH I E 
KI TTWQDPRKAMNQPLNHMNLH PAVSSTPVPQRSMAVSQPNLVM 
NHQHQQQMAPS TLSQQNHPTQNP P AGLMSMPNALTTOGjQQQQKL 
R I»QR I QMERERI RMRQEELMRQEAAIjCROL PMEAETIiAPVQAAV 
NP PTMTPDMRS ITNNSSDPFLNGGP YHSREQSTDSGLGLGCYSV 
PTTPEDFIiSNVDEMDTGENAG<yrPMNINPOX3TRFPDFLDCljPGT 
NVDLGTLESEDLI PLFNDVESALNKSEPFLTWL 


6602 


127 


617 


LI*DFPALPKFVLAQSPKAGKPSTMTSMTQSI*REVI KAKTKARNF 
ERVLGKITLVSAAPGKVICEMKVEEEHTNAIGTLHGGLTATLVD 
NI STMALLCTERGAPGVS VDMNI T YMS PAKLGED I VI TAHVLKQ 
GKTIiAFTSVDIiTNKATGKLIAQGRHTKHIiGN 


6603 


79 


660 


PVGPSSLAARTGLGHLPFLHRlASSRGIiDMDLIiOFLAFTaFVLLL 
SGMGATGTLRTSLDPSLE I Y KKM FE VKRREQLLALKNLAQLND I 
HQQYK I LD VMIiKGLFKVLEDSRT VLTAADVLPDG P FPQDEKLKD 
AFSH VVENTAFFGDVVIiRFPRI VH Y YFDHNSlIVJNI»i*I RWG I S FC 
NQTGVFNQG PHS ? I LSLM 


6604 


3 


688 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNH FRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLLGEFli 
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEIFGQLR 
DFYFS VKLSENMKASS FKKLQKFY IDP YKLLPLQRFLPRP PGEK 

GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDRDLJ3DVX.EKAKKANVV 
ALVAVAEHSGEFEKIMQLSERYNGFVLPCLGVHPVQGrjPPEDQR 
SVTLKDliDVALPI IENYKDRL.LAIGEVGLDFSPRFAGTGEQKEE 
QRQVLt IRQIQLAKRhNhPVNVHS RSAG R PT I NIiLQEQGAEKVbL, 
HAFDGRPSVAMEGVRAGYFFS I PPSI IRSGQQKIiVKQLPLTS I C 
LETDS PALGPEKQVRNEPWNI S ISABYIAQVKGISVEE VIEVTT 
QNALKLFPKLRHIiLQK 


6606 




1682 


FVEIRPRAEVANLSAHSASPIQDAVLKRLSLLEBIVYRQLNGIiS"" 
KSLGLIEGYGGRGKGGIjPATLSPAEEEKAKGPHEKYGYNSYLSE 
KISLDRS I PDYRPTKCKELKYSKDIiPQISI IFIFVNEALSVILR 

SVHSAVNHTPTHT.T.K''PTTT»'\rn , nKrer»T7i7'C , T mmr t?c •v-imim-vnr^t 
«-f w niMiTiini tr AnuujVEiA -±U Vl/UI^ 9i/uJSbijJvv IrJuaCtX VriltRYPGXi 

VKWRNQKREGL IRARI EGWKVATGQVTGFFDAHVE FTAGWAE P 
VLSR I Q EN RKRVI LPS I DN I KQDNFEVQRYENSAHG YS WELWCM 
YISPPKDWWDAGDPSLPIRTPAMIGCS FWNRKFFGE IGLLDPG 
MDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAEVWMDDYKS HVY I AWNLPLENPG ID I GDVS ER 
RALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRTOKAKDVC 
LDQGPLENHTAI L Y PCHGWG PQLARYTKEGFIiHLGALGTTTLL P 
DTRCLVDNS KSR LPQLLDCD KVKS S L YKRWN FI QNGA IMNTCGTG 
RCIiEVENRGLAG I DLI LRSCTGQRWTI KNS I K 


6607 


137 


986 


VPACAGLKKEARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
GISFCGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKREIjQVLYRGFKNECP 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








SG WNEDT FKQI YAQFFPHGDAST YAHY LPNAFDTTQTGS VKFE 
DFVTAIjS I LLRGTVHEKLRWTFNLYD INKDGY INQEEMMD I VKA 
IYDMT^GKYTYPVLKEDTPRQHVDVFFQKMDKNKDGIVTLiDEFLE 
SCQEDDNIMRSLQLFQWVM 
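
The one-letter codes and special symbols named in the column legend of this table can be expanded mechanically. The following is a minimal sketch only, not part of the disclosed methods: the names `LEGEND` and `decode_segment` are hypothetical, and the handling of whitespace and of characters outside the legend is an assumption made here so that a scanned segment can be checked for transcription damage.

```python
# One-letter codes and symbols exactly as listed in the table's column legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Expand one amino acid segment into legend names.

    Returns a pair (names, unrecognized). Whitespace is skipped (assumption:
    spaces inside a segment are layout residue, not data). Any character that
    is not in the legend is collected in `unrecognized` rather than guessed.
    """
    names = []
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in LEGEND:
            names.append(LEGEND[ch])
        else:
            unrecognized.append(ch)
    return names, unrecognized


if __name__ == "__main__":
    names, unrecognized = decode_segment("SCQEDDNIMRSLQLFQWVM")
    print(len(names), names[:3], unrecognized)
```

A non-empty `unrecognized` list flags a segment containing characters outside the legend, which may be useful when comparing a transcribed segment against the original listing.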


6608 


224 


1140 


RPCFSS PTGLCPRIjS Y PM I LU2HAVLPPPKQPS PS PPMS VATRS 
TGTLQLPPQKPFGQEASLPLAGEEELSKGGEODCALEELCKPLY 
CKI»CNVTIiNS AQQAQAHYQG KNHG KKLRNY YAANSCP P PARMSN 
WEPAATPWPVPPQMGS FKPGGRVI LATENDYCKLCDASFSSP 
AVAQAHYQGKNHAKRLRIAEAQSNS FSESSELGQRRARKEGNEF 
KMMPNRRl^YWQ^SGPYF^RSRQRIPRDIAMCVTFSGQFYC 
SMCNVGAGEEME FRQHLE S KQHKS KVS EQRYRNEMEULG Y V 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRL»SALTtiLSWSAVT 
PAAEPGNFQIiSPAEPRGPIiASPVRAAPRAPCPAAEMSELNTKTS 
PATNQAAGQEE KGKAGNVKKAEEEE E I DI DLTAPETEKAALAI Q 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLHIFIRFPLTYPDMYMGMMCTAKKCGIRFQPPAII3UI 
YESE I KGKIRQR IMPVRNFSKFSDCTRAAEQLKNNPRHKS YI*EQ 
VSIiRQIjEKLFSFIiRGYliSGQStJVETMEQIQRETTIDPEEDliNKIj 
DDKEI»AKR KS I MDEL»FE KNQKKKDDPNFVYDI E VBFPQDDQLQS 
CGWDTESADEF 


6611 


978 


212 


PGCSGAGS R VW WliPALRHliAMGSTESS EGRR VS FG VDEEE R VRV 
LQGVRLSENWNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST 
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKIiFQVAKREREAA 
TKHS KASLPTGEGS I SHEEQKS VRLARELESREAELRRRDTFYK 
EQLERIERKNAEMYKLSSEQFHEAASKMESriKPRRVEPVCSGIj 
QAQ IliHCYRDRPIIEVLLCSDLVKAYQRCVSAAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGSSDGDQRESVQOEPEREQVQPKKKEGKI 
S SKTAAKIjSTSAKR I QKEltAE I TJbD P P PNCSAG P KGDNI YEWRS 
TI DGPPGS VYEGGVFFLD ITFS PDYPFKPPKVTFRTRI YHCKIN 
SQGVI CLDILKDNWSPALTI SKVLLS I CS LLTDCNPADPL VGS I 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELS SNM PEQSND YRVAVFGAGGVGKS SLtVLRFVKGTFRESYI 
PTVEDTYRQVISCDKSICTIjQITDTTGSHQFPAMQRLSISKGHA 
FI LVYS I TSRQS IjEELKP I YEQ ICE 1 KGDV3S I P IMLVGN KCDE 
S P SRE VQS SEAEAIiAR WKCAFMETS AKLNHNVKELFQELLNLE 
KRRTVS LQI DGKKS KQQKRKEKLKGKC V I M 


6614 


3 


1191 


S S AAE AMR VLVRRCWGPPLAHGARRGRPS PQ WRA1ARLGWBDCR 
DSRVREKPPWRVLFFGTDQFARBALRALHAARENKEEELIDKLE 
WTMPS PS PKGLPVKQYAVQS QLPVYE WPD VGSGE YDVGWAS F 
GRLLNEAL I LKF PYG IIJJVHPSCLPRWRGPAPVIHTVLHGDTVT 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 

EQ I FRLYRAIGN 1 1 PLQTLWMANTI KLLDIiVEVNS S VLADP KI*T 
GQALI PGS VI YHKQSQILLVYCKDGW IGVRS VMI.KKSLTATDFY 
NG YLH PW YQKNSQAQ PSQCRFQTLRLPTKKKQKKTVAMQQCI E 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSIiRLQTDARKVRCIIjTGHE 
LPCRL PELQVYTRGKKYQRLVRAS PAFDYAE FEPH I VPSTKNPH 
QLFCKI.TLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRREDQMDGDGPRPREAFWEPTSSDEGGAASDDSM 
TDLYP PEL FTRKDLGSTEDGDGTDD FLTDKEDEKAKPPREKATD 
EGRRETT V YRGLVQ KRGKKQLGSLKKKFKSHHRKPKS FS S CKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNC I LCAFPHPS PQPLQYP 
VWPLI*LVITQI PAPRHLRNRPFS FSRGGLDSFSGSLSTPSI CRS 
PAW VKMAPV/PPKGLVPAVLWGLSbFLNJuPGP I WLQPS PPPQSS P 
P PQPH P CHTCRG LVDS FN KGL ERT IRDN FGGGNTAW E E ENLS K Y 
aajz>c i Kiiva VJjt.CjVCSKSDFECHKLjjELSEEI>VESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAG YGGEACG QCGLG Y FEAERNAS HLVCS ACF 
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEGANCGAD 
<- vn rtG5 lfc.CRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
LDVDECETEVCPGEN KQCENTEGG YRCI CAEG YKQMEG ICVKEQ 
IPESAGFFSEMTEDELWLQQMFFGIIICALATLAAKGDLVFTA 
I FI G AVAAMTG Y WLS ERSDRVLEGFI KGR 


6617 


118 


673 


VWMAWQVSLLELEDRLQCP I CLEVFKESLMLQCGHS YCKGCLVS 
I>S YHLDTKVRCPMCWQAVDGS S SLPNVSLAWIEALRIiPGDPEP 
KVCVHHRNPliSLFCEKDQEIil CGI*CGI»LGSHQHHPVTPISTVCS 

RMKEEIiAALFSEI»KQEQKKVDELIAKLVKNRTRlDGSAPSI*CPC 
LGPATFTFIi 


6618 


548


136 


DGKVARRAPNS PAFQNDI YPLVSAPRATTAESPWSKVLQNTQCR 
NVPKMTSERSRI PCLSAAAAEGTGKKQQEGRAMATLDRKVPS PE 
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PAS S E VI*TAAVMFI*LIiNCI VAVSONMG I GKNGDIj PR PPLRNEFR 
YFQRMTTTS SVEGKQNLVIMGRKTWFS I PEKNRPLKDR1N Jb VhS 
RELKEPPQGAHFUUlSLDDALKLTERPEIJuNFKyDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESI>TFFSEIDLEKYKLLPEYPG 
I LSDVQEGKH I KYKFE VCEKDD 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYS P VDYMS ITS FPRLPE 
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 
MGS QDGS PLRE TRKD P FS AAAAECSCRQDGLTVI VTACliTFATG 
VTVALVMQ I YFGDPQI FQQGAWTDAARCTSLGI EVLS KQGSSV 
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES 
APGALREETLQRS WETKPGLL VGVPGMVKGLHEAHQLYGRI* PWS 
QVIiAFAAAVAQE^FNVTHDLARAIjAEQLPPNMSERFRETFIiPSG 
R P PL PGSItLHR PDtiAEVLDVLGTSG PAAFYAGGNLTI*EMVAEAQ 
HAGG VI TEEDFSNYSAIiVEKPVCG V YRGHLVLS PPPPHTGPALI 
S ALiN I L EG FNLTS L.VS RE QA3^ WVAETLK I ALALASRLGD PVYD 
S T I TESMDDMLS KVEAAYLRGH INDSQAAPAPIjIjPVYEIjDGAPT 

aaqvblmgpddfivamvsslnqpfgsglitpsgillnsqmiidfs 
wpnrtanhsapslensvqpgkrplsfllptvvrpaeglcgtyla 
lgangaarglsgltq vrftpwlaffsre ps cgldcrcls ylwi.v 

SIPHAANMG 


6621 


1 


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKENFEFYEIiAKLLiPLP 
AAI TSQLDKAS I IRLT ISYLKMRDFANQGDPPWNLRMEGPPPNT 
SVKVIGAQRRRS PSALAI EVFEAHLGSH ILQSIiDGYVFAIjNQEG 
KFL Y I SETVS I YLGLSQVELTGSS VFD YVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFKL 


6622 


2 


319 


CJKASGAQE ETEAGGPERARAMEANM PKRKEPGRS LR I KVI SMGN 
AE VGKS CI I KRYCEKRFVSKYLATIGIDYGVTKVHVRDRE I KVN 
I FDMAGHPFFYEVRKPF 


6623 


1886 


189 


KALFEKVKKFRLHVEEGDIIiYAMYVRQTVLKVIKFLI I IAYNSA
LVS KVQFT VDCNVDIQDMTGYKNFSCNHTMAHLFSKLS FCYLCF 
VSIYGLTCLYTLYWLFYRSLRBYSFEYVRQETGFDDIPDVKNDF 
AFMI^MIDQYDPLYSKRFAVFLSEVSENKLKQLNIiNNEWTPDKI* 
RQKLQTNAHNRI1ELPLIMLSGI.PDTVFEITELQSLKLEI IKNVM 
I PAT I AQLDNLQELS LHQCS VK I HS AAL S FJuKENLKVLSVKFDD 
MRELPPWMYGLRNLEELYLVGSLSHDISRNVTLESLRDLKSLKI 
LS I KSNVS KI PQAVVDVS SHLQKMCIHNDGT KLVMLNNLiKKMTN 
LTELELVHCDLERIPHAVFSLIiSIiQELDLKENNLKSIEEIVSFQ 
HLRKLTVLKLWHNSITYIPEHlKJOiTSLERLSFSHNKIEVLPSH 
LFLCNKIRYLDLSYKDIRFIPPEIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPiaGNLLFLSYLDGKGNHFEI 
L PPELGD CRALKRAGLWEDAL FETL PS D VREQMKTE 


6624 


218 


1786 


GS RRGGGSRI PAVSTH VAPGRS VXiRPFASGALiRIiRSLVKALGGC 
RGRPSGLAHLSQETSHWRAKRSGRACLGDFPGEILRSFIMKCTA 
REWLRVTTVLFMARAIPAMVVPNATLIjEKLIjEKYMDEDGEWWIA 
KQRGKRAITDNDMQSILDLHNKLRSQVYPTASNMEYMTWDVEIiE 
RSAESWAES CLWEHGPASIJjPS T C50NT.fiAHWfH? yt? PUfPui/ncH 

YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AINLCHNMNIWGQIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT 
HVRTRSDDSSRNEVI S AQQMSQ I VSCEVRLRDQCKGTTCNRYEC 
PAG CLDS KAKV I GS VH YEMQSS 1 CRAAIHYGI IDNDGGWVDITR 
QGRKH Y F I KSNRNG I QTIGKYQS ANS FTVS KVT VQ A VTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF 


6625 


1124 


543 


PGPRGGGGSLLSTKALGRSRGLGMHPGPSSGGTEGGVPTALRPP 
GPLVPSTSDDWIiLKNI ELFT)KLALRFHGRLLFLKDVLGDEI CCW 
SFYGOXSRKlAEVCCTSIVYATEKKQTKVEFPEARIFEETIiNIIiI 
YETPRGPDPALLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR 


6626 


3 


1498 


SAVEFVYTDRFHblLG IS VEFI>CSLRSDATMES I TACLHALQAli 
LDVPWPRSKIGSDQDSGI EliLNVLHRVI LTRESPS IQLASLEW 
RQ 1 1 CAAQEHVKEKRRSAEVDDGAAEKETI.PEFGEGKDTGGLVP 
GKS LVFATLELCVCILVRQLPEI*NPKLTGSPGVKATKPQI LLED 
GSRLVS AALVTLS EL PAVCSPEGS I S IL PTIL YL TIG VLRETAV 

TTI LDCWDP VT>ETHQELDEVSLLTAITVFILSTSPEVTTI PCLQ 
KRC IDKFKATLEI KDPWQI KTYQIiLHS I FQYPNPAVS YP YI YS 
LAS CIMEKLQE IDKRKPENTAEIiE I FQEGIKVLETLVTVAEEHH 
RAQLVACLLP I LI SFLLDENSLGS ATS IMRNLHDFALQNLMQIG 
PQ YSS VFKS LVAS S PALKARLEAA I KGNQESVKVKI PTS KYTKS 
PGKNSSIQLKTSFL 


6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL
GDTGVGKTCFLIQFKIX3AFLSGTFIATVGIDFRNKVVTVDGVRV 
KLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDNIRA 
WLTE I HE YAQRD WIMLLGNKADMS S ERVIRS EDGETLARE YGV 
PFLETSAKTGMNVELAFLAI AKELK YRAGHQADE PS FQ I RD YVE 
SQKKRSSCCSFM 


6628


1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN 
KEFGDSLS LE I LQ 1 1 KES QQQHGLRHGDFQRYRG YCSRRQRRLR. 
KTLNFKMGNRHKFTGKKVTEELLTDNR YLLL VLMDAERAWS YAM 
QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 
IjE AQAYTAYLSGMLRFEHQEWKAAIE AFNKCKT I YEKLASAFTE 
EQAVLYNQRVEE IS PNIRYCAYNIGDQSAINELMQMRLRSGGTE 
GL LAEKLEAIi I TQTRAKQAATMSE VEWRGRTVPVKIDKVR I FLI* 
GLADNEAAIVQAESEETKERLFESMLSECRDAIQWREELKPDQ 
KQRDY I LEGEPG KVSNLQ YLHS YLTYI KLSTAI KRNENMAKGLQ 
RALLQQQPEX)DSKRSPRPQDLIRLYDIILQNLVELLQLPGLEED 
KAFQKE IGLKTL VFKA YRC F FI AQS YVLVKKWSE ALVLYDRVLK 
YANEVNSDAGAFKNSLKDLPDVQEIj I TQVRSE KCSLQAAAI LDA 
NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 
GFQPI PCKPLFFI)LAI,NHVAFPPLEDKLEQKTKSGLTGYIKG I F 
GFRS 


6629 


5653 


4549 


GATPLGSVGGRTGKMDAATLT YDTLRFAEFEDFPETSE PVW I LG 
RKYS I FTEKDE ILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 
MLRCGQM I FAQADVCRHLGRDWRWTQRKRQPDS YFS VLNAFIDR 
KDS Y YS IHQ I AQMGVGEG KS I GQW YG PNTVAQ VL KK LAV FDTWS 
SLAVHIAMDWTWMEEI RRLCRTS VPCAGATAFPADSDRHCNGF 
PAGAEVTNRPSPWRPLVLLIPLRLGLTDINEAYVETLKHCFMMP 
Q S LG VI GG KPNS AHY F I G YVGEEL I Y/LDPHTTQPAVE PTDGCFI 
PDESFHCQHPPC-^SlAEIiDPSIAVVRGGHLSTQAFGAECCLiGM 
TRKTFGFLRFFFSMLG 


6630


2 


423 


LVQCGG IRRRS AWGAMPGRHVSR VRAIiYKRVLQLHRVIiP PDLKS 
LGDQYVKDE FR RH KTVGSDE AQRFLQ KWEV Y ATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 
SESMKPKF 


6631 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ 
NS TGKACFGT FL? EEKLNDFRDEQ IGQLQELMQEATKPNRQFS I 
SESMKPKF 


6632 


1273 


588 


WNSRGRTQRGAAPLAPAAAMKAWQRVTRASVTVGGEQISAIGR 
GICVLLGISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQYE I LCVSQFTIiQCVLKGNKPDFHLAMPTEQAEGFYNS FLEQL 
RKTYRPELIKDGKFGAYMQVHIQNDG PVTIELES PAPGTATS DP 
KQLSKLEKQQQRKEKTRAKGPSESS KERNTPRKEDRSAS SGAEG 
DVSSEREP 


6633 


1145


617 


ATGRHEG VPTLEG I IQQLVNG IITPATI PSLGPWGVLHSNPMD Y 
AWGANGLDAI I TQLLNQFENTGPPPADKEKIQALPTVPVTEEHV 
GSGLECPVCKDDYALGERVRQLPCNHLFHDGCIVPWLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSS S SSSSSSS SPSNENATSNS 


6634 


1 


1134 


CGG I PRKGSG PRRR LPMARLRDCLPR LM LTLRS LLFWSLVYC Y C 
GLiCAS I HLLKLLWS LGKG PAQTFRR PAREHP PACL SDPSLG THC 
YVRIKDSGLRFHYVAAGERGKPLMLkLHGFPEFWYSWRYQLREF 
KS E YR WAliDLRG YGETDAP I HRQNY KLDCL I TD I KD IL.DS IjG Y 
SKCVL IGHDWGGM IAWLI A I CYFEM VMKLI VINFPHPNVFTEYI 
TiRHPAQLLKSSYYYFFQI PWFPEFMFS INDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYIYVFSQPGALSGPINHYRNIFSCLPLKH 
HMVTTPTLLLWG ENDAFMEVEMAEVTRF YVKNYFRIjTILS EASH 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMIiRWTRAWRLPREGLGPHGPSFARVPVAPSSSSG 
GRGGAEP R PL PLS YRLLiDGE AALPAWFLH5DFGS KTNFNS IAK 
ILAQQTGRRVLTVDARNHGDSPHSPDMSYEIMSQDLQDLLPQLG 
LVPCVWGHSMGGKTAWLLAUJRPELVERLIAVDISPVESTGVS 
HFATYVAAMRAINIADEZiPRSRARKIiADEOLSSVIQDMAVRQHr, 
LTNLVEVDGRFWRVNU1AI*TQHLDKILAFP0RQESYLGPTLFI, 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRGFLV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 
QPIVRQCI^QRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHDCTCVLDKAGSYKCACLAGYTGQRCENLLEAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRIIAKIGT 
WS FFCNN S YVL SGNEKR TCQQNGE WSGKQPICI KACREPKJ SD 
LVRRRVLPMQVQSRETPLHQLYSAAFSKQKLQSAPTKKPALPFG 
DLPMGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 
CIP ICGKI ENITAPKTQGLRWPWQAAI YRRTSG VHDGSLHKGAW 
FLVCSGALVNERT WVAAHCVTDLGKVTMI KTADLKWLGKFYR 
DDDRDEKTI QS LQ I SAI ILHPNYDP I LLDAD IAI L KLLDKAR I S 
TRVQP I CliAASRDLSTS FQESH I TVAG WNVLADVRS PG FKNDTL 
RS GWS WDSLLCE EQHEDHG I P VS VTDNMFCAS WE PTAPSDI C 
TAETGG I AAVS FPGRASPEPRWHLMGIiVSWS YDKTCS HRLSTAF 
TKVLPFKDWIERNMK 


6638 


1391 


224 


GGIPQAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 

PNSDIDLSNLERLEKYRSFDRYRRRAEQEAQAPHWWRTYREYFG 

T?T/T'nnvc , VTnTPT nran VWCDTTVNT .T .TTTC? VOS T OP!T ,"Q ZXW\7Tv"RT^R AA 
fc.K.1 jJir^CjixU.JLJJ.t?ijrxriri\VoK. L yyjjljlilvtvy^^S*"^ 1 *-"** vxsDJUWTn 

RLRTAS VPLDAVRAEWERTCGPYH KQRbAEYY G LYRDLFHGATF 
VPRVPLHVAYAVGEDDLMPVYCGNEVT PTEAAQAP EVTYEAEEG 
SLWTLLLTSLDGHLLE PDAE YTjHWLLTNI PGNRVAEGQVTC P YL 
PPFPARGSGIHRU^FLLFKODQPIBFSEDARPSPCYQliAORTFR 
TFDFYKKHQETMTPAGLSFFQCRWDDSVTYIFHQLLDMREPVFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1268 


IGCF I MDGGDDGNLI I KKRF VSEAELDERRKRRQEEWE KVRKPE 
DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDEVSRQQELIEKQRREBEIiKEIiKEYRNNLKKVGISQE 
NKKEVEKKLTVKP I ETKNKFSQAKLIiAGAVKHKS SESGNS VKRL 
KPDPEPDDKNQEPSSCKSLGNTSIiSGPS IHCPS AAVCIG I bPGL 
GAYSGSSDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLEPPDVSMAESEDRSLRIVbVGKTGSGKSATANTlIiGEEIFDS 
RIAAQAVTKNCQKASREWQGRDLLWDTPGLFDTKESI>DTTCKE 
I SR C 1 I SSCPGPHAI VLVLLLGRYTE EEQKT VAI»I KAVFG KSAM 
KHMVIbFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
MS KKTS KAEKESQVQELtVE L IEKMVQ CNEGAY FSDD I YKDTEER 
LKQREEVLRKI YTDQIiNEE I KbVEEDKHKSEEKKEKEI KLLKLK 
YDEKI KNIREEAERNIFKDVFNRI WKMbSEIWHRFbS KCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRIiRRSARRMDPVPGTDSAPIoAGLAWSS 
ASAPPPRGFSAI SCTVEGAPASFGKS FAQKSG Y FLCLS SLGSLE 
NPQENW7VDIQIWDKSPLPLGFSPVCDPMDSKASVSKKKRMCV 
KLbPLGATDTAVFDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APRPVPKPRGLSRDMQGLSLDAASQPSKGGLLERTASRLGSRAS 
TLRRNDS I YEASSLiYG 1 SAMDGVPFTLHPRFEGKSCS PliAFSAF 
GDI.TI KSIiAD I EEEYNYGFWEKTAAARbPPS VS 


6642 


22 


1296 


PLEERMMTKMDPNDQAQRDI I FELRRIAFDAESDPSNAPGSGTE 
KRKAMYTKDYKMLGFTNHINPAMDFTQT^ 

HQDTYI RIVbENSSREDKHECPFGRS AI ELTKMbCEILQVGELP 
NEGRNDYHPMFFTHDRAFEISIjb G I C X yj-iiUNis. i wr^r*s*ii<j\iJ\r*LJi: a 
KVMQWREQ I TRAIiPSKPNSLDQFKS KLRSLS YS E I LRIiRQS ER 
MSQDDFQSPPIVELREKIQPEILELIKQQKLNRbCEGSSFRKIG 
NRRRQERFWYCRLALNHKVLHYGDLDDNPQGEVTFESLQEKIPV 
ADIKAIVTGKDCPHMKEKSAbKQNKEVL.ELAFS ILYDPDETLNF 
IAPNKYEYCIWIDGLSALI^KDMSSELTKSDLDTbLSMEMKLRL 
LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


3049 


2265 


SLHAPAEGRTRGRIAEKPKMbTRKIKbWDINAHITCRiCSGYbl 
DATTVTECLHTFCRSCLVKYIJ3ENNTCPTCRIVIHQSHPLQYIG 
HDRTMQDI VYKLVPGLQEAEMRKQREFYHKLGMEVPGDI KGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLE CNSSKIjRGLKRKW IRCS AQATVLHLKKF I AKKliNIiS S FN E L 
DILCNEEIl^KDHTLKFVVVTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


" FRPlATEPRGSSPVQLVSSTMSVRTLPbLFLNLGGEMIiYILDQR 
LRAQN I PGDKARKVLND 1 1 STMFNRKFMEELFKPQELYSKKALR 
TVYERXiAHAS IMKLNQASMDKLYDLMTMAFKYQVbbCPRPKDVL 
LVTFNHbDT I KGFIRDS PTI LQQVDETbRQbTE I YGGbS AGEFQ 
LIRQTLbl FFQDLHI RVSMFLKDKVQNNNGRFVbPVSGPVPWGT 
EVPGb I RMFNWKGEE VKRI E FKHGGNYVPAPKEGS FEF YGDRVIj 
KLGTNM YSVNQ P VETHVSGS S KN1ASWTQES IAPNPLAKEELNF 

LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY
EVINIQATQDQQRSEELARIMGEFEITEQPRLSTSKGDDLLAMM

DEL 


6645 


6530


4646 


FVEGLAGYVYKAASEGKVLTLAALLLNRSESDIRYLLGYVSQQG 
GQRSTPL1IAARNGHAKVVRLLLEHYRVQTQQTGTVRFDGYVID 
GATALWCAAGAGH FE VVKLL VSHGANVNHTT VTNST PLRAACFD 
GRIiDI VKYLVENNAN I S I ANKYDNTCLMI AAYKGHTD WRYL-LE 
QRADPNAKAHCGATALHFAAEAGHIDIVKELI KWRAAI WNGHG 
MTPL KVAAESCKAD WELLLSHADCDRRSRI EALELLGAS FAND 
RENYDIIKTYHYLYLAMLERFQDGDNILEKEVLPPIHAYGNRTE 
CRNPQELES IRQDRDALHMEG1.I VRERILGADNIDVSHPI I YRG 
AVYADNME FEQCI KLWbHALHLRQKGNRNTHKDIiLRFAQVFSQM 
IHIiNETVKAPDIECVLRCSVLEIEQSMNRVKNISDADVHNAMDN 
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCS FPNALVTKLLLDCGAEVNA 
VDNEGNSALHI IVQYNRPI SDFLTLHS II ISLVEAGAHTDMTNK 

QNKTPLDKSTTGVSEILLKTQMKMSLKCLAARAVRANDINYQDQ 
IPRTLEEFVGFH 


6646 


176 


890 


PS SRMNHLP EDMENALTGSQS SHASLRNI HS I NPTQLMAR I ES Y 
EGRE KKG I S DVRRTFCLF VTFDLLFVTLLW 1 1 ELNVNGG I ENTL 
EKEVMQYDYYSSYFDIFI1I1AVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTSAFLUVKVTLSKIjPSOnAFnYVT.PT t<;t?tt imtptwdt t> 
FKVLPQEAE EENRLL I VQDAS ERAAL I PGG L S DG QF YS P PE S EA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASIjRNIHSINPTQLMARIESY 
EGREKKG I SDVRRTFCLFVTFDI^IiFVTT 1T1WT T P*T .Krirurwi t rtjtt 

EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTSAFLLAKVILS KLFSQGAFG YVLPI IS FILAWIETWFLD 
FKVLPQEAEEENRI,I,IVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDS EKPLLEL 


6648 


413 


897 


RNCWNCFTK Y FNS PPED I DHKDS YL I TRS I MAEPD Y I EDDNPEL
IRPQ KLI NPVKTSRNHQDDHRELIJWQKRGIiAPQNKPELQKVME 
KRKRDQVIKQKEEEAQKKKSDLEIELLKRQQKLEQLELEKQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES j 


6649 


1357 


832 


W I PRAAGIRHE VKWDVKE I MSQHNI YVDALIjKEFEQFNRRLWEV 
SKRVRIPLPVSNILWEHCIRUUTOTIVEGYANVKKCSN^GRAIiM 
QIJ5FQQFLMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWI 
KEHREYSTKQLTNLVNVCLGSHINKKARQKLIiAAIDDIDRPKR 


6650 


32 


765 


LVPLVFSLLVQS CKQVYRS I AMKFVP CIiLLVTLS CLGTLGQAPR 
QKQG S TGEE FH FQTGGRDS CTMRPSSIX3QGAGE WLRVDCRNTD 
QTYWCEYRGQPS MCQAFAADPKSYWNQALQEIiRRLHHACQGAP V 
LRPS VCR EAGPQAHMQQ VTSSLKGS P EPNQQPEAGTPSI»R PKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKP FQALCAFL I S FFRG 


6651


3425 


1353 


AKELL K VGDFSI#CAGP YO^JTADTMENliSKEPIiAS FVSES FDISA 
CG I ATEHVKIDNSGEGLTAEAGSETLS RDGEVGVNS DMHYELSG 
DSDLDLLGDCRNPRIjDLEDS YTLRGS YTRKKDVPTDG YES S LNF 
HNNNQEDWGCS S WVPGMETS LPPGHWTAAVKKEEKCVP P YVQ I R 
DIiHG I LRTYANFS ITKELKDTMRTSHGLRRHPS FSANCGLPSSW 
TSTWQVADDLTQNTLDLE YLR FAHKLKQTI KNGDS QHSAS SAW 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLLVTWESDP 
RPQGQPRRGYTASSLDSSSSWRERCSHNRDLRNSQRNHTVSFHIi 
NKLKYNSTVKESR1TOISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSAS YEDI I IDVCTNLHVKLRS WKEA 
CKSTFLFYLVETEDKS FFVRTKNIiDRKGGHTE I E PQHFCQA FHR 
ENDTLI I 1 IRNEDISSHLHQ I PSLLKLKHFPS VI FAG VDS PGDV 
LDHTYQELFRAGGFVISDDKI LEAVTLVQLKEI IKI LEKLNGKG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANI I ELLH 
YHQCDSRS S TKAE I LKCLLNIjQI QHIDARFAVLLTDKPT I PREV 
FENNGI LVTDVNNF I ENIEKIAAPFRSS YW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPLFLPTRPAERAWIRSRRASEWVGKMEVPRIJDHALNSPTSPC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 
TCDSFKISWEMDSKSKDRITHYFIDIjNKKENKNSNKFKHKDVPT 
KLVAKAVP L PMTVRGH WFLS PRTE YTVAVQTAS KQVDGDYWSE 
WSE I IEFCTADYSKVHLTQLLEKAEVIAGRMLKFSVFYRNQHKE 
YFD YVREHHGNAMQP S VKDNSGSHGS P I SGKLEG I FFS CSTEFN 
TGKPPQDS P YGR YR FE I AAE KLFNPNTNLYFGDFYCM YTAYHYV 
I L V XAP VGS PGDE FCKQRLPQLNS KDNKFLTCTEEDGVLV YHHA 
QD V I LEVI YTDPVDLS LGTVAE ITGHQLMS LSTANAKKD PSCKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
RVAAAASRGADDAMES SKPGPVQWLVQKDQHSFEIjDEKALAS I 
LLQDH I RDLD VVWS VAGAFRKGKS F I LDFMLR YLYS QKE SGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
VVI.MJDTQGAFDSQSTVKDCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTEYGRLAMDEIFQKPFQTLMFLVRDWSFPYEYSYGI. 
QGGMAFLDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATS PDFDGKLKD lAGEFKEQLQALI PYVXNPSKLMEKEING 
SKVTCRGLLEYFKAYIKIYQGEDIiPHPKSMLQATAEAYNIiAAAA 
SARD I YYNNMEEVCGGEKPYX.SPD IliEEKHCEFKQLALDHFKKT 
KKMGGKDFS FRYQQELEEEIKELYENFCKHNG S KNV FSTFRTPA 
VLFTG I VAL Y I ASGLTG FIGLE WAQIJ?NCMVGLLLI ALLTWG Y 
IRYSGQYRELGGAIDFGAAYVLEQASSHIGNSTQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTS LS PSQCS S FNLAMASAGMQ I I»G WLTLLG W VNGLVS CAL»PM 
WKVTAFIGNS IWAQWWEGLWMSCWQSTGOMQCKVYDSLLAL 
PODLQAARALCVIALLVALFGLLVYLAGAKCTTCVEEKDSKARIi 
VLTSGIVFVISGVLTLIPVCWTAHAVIRDFYNPLVAEAQKRELG 
ASLYI^WAASGLLLI^GGIJ^CCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 


16 


KDAYM FKKGLLALALVFS LP VFAAEHW IDVR VPEQ YQQEH VQGA 
INI PLKEVKBRI ATAVPDKNDTVKVYCNAGRQSGQAKE ILSEMG 
YTHVENAGG LKDI AMPKVKG 


6656 


2 


1212 


TELPPRPANLAIQPPLSPLRALAPLPEKPGAVPPPQKRMAKVAK 
DLNPG VKKMSLGQLQS ARG VACLGCKGTCSGFEPHS WRKI CKS C 
KCSQEDHCLTSDLEDDRKIGRLLMDS KYSTLTARVKGGDG I RI Y 
KRNRM IMTNP IATGKDPTFDTITYEV7APPGVTQKLGLQYMELIP 
KEKQ PVTGTEGAFYRRRQLMHQLP I YDQDPSRCRGLLENELKLM 
EEFVKQYKS EALGVGEYALPGQGGLPKEEGKQQEKPEGAETTAA 
TTNGSLSDPSKEVEYVCELCKGAAPPDSPWYSDRAGYNKQWHP 

IFAEDYQRVEDLAWHRKHFVCEGCEQLLSGRAYIVTKGQLLCPT 
CSKSKRS 


6657 


830 


2120 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPEYCEP 
LEHFTGQDL INLTQEDFKKP PLCRVSS DNGQRLLDMI ETLKMEH 
HLEAHKNGHANGHLNIGVD I PTPDGS FS I KI KPNGMPNGYRKEM 
I KI PMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMI SWHER 
VPPKEVQ PPLPDTFFDHFNR VQWAFS I CEINGM ILVGLWLI QWL 
LLKYKS I ISRRFFCI VGTLYLYRCITMYVTTLPVPGMHFNCSPK 
LFGDWEAQLRR IMKLIAGGGLS I TGS HNMCGDYL YSGHTVMLTL 
TYLFI KEYS PRRLWWYHWI CWLLS WG I FCI LLAHDH YTVD VW 
AYYITTRLFWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 
NVQGIVPRSYHWPFPWPWHLSRQVKYSRLVNDT 


6658 


35 


855 


HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM 
FDPVPVKQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 

QTPEGLSHGIQMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 

SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 

IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 

MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 



18 


523 


KPURGDCETWFQNCSLPKFVCFFCHGFWLWRAHSMSNLHSLPGL
RGLTS I SRNQLQCTNAMRV1NNYQRR WKNQNTFLIiATFANWNV 
CGNPTI TCPHNRTLNNCHHSG VQ VPLM YCNLTTPS PQN I SNCR Y 
AQTPANMFYI VACDNRDQRRDPPQYP WPVHLHTI I 


6660 


514 


1707 


CAASLDCRHHLCEPDMKL,VWPSAKLiQAAAGASARACDSVTSNV 
LPLLLEQFHKHSQSSQRRTILEMtiLGFLKLQQKWSYEDKDQRPL 
NGFKDQI.CSLVFMALTDPSTQLQLVGIRTLTVIiGAQPDLLSYED 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHb 
VPKIAEELRVGESNLTKGDEPTQCSRHIjCCLQALSAVSTHPSIV 
KETLPLLIjQHLWQ VNRGNMVAQS SDV I AVCQSIjRQMAE KCQQD P 
ES CW YFHQTA I PCLLALAVQ ASMPE KE PS VLR KVLLEDE VLAAM 
VS VIGTATTHLS PELAAQSVTHI VPLFLDGNVSFLPENS FPSRF 

QPFQDGSSGQRRLI ALLMAFVCSLPRNVS EH I WEVLLFNLDKVT 
PG 


6661


179 


430 


GVHAASGTLSATWIAEAKMFDSLAiCAGKYLGQAAKLMIGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


RSLPKPAPAQPAS IHCARFSGVTP PTAKTAMSDGNTAFNALM YC 
GPKADDGNI FSACAPASSAVKASVSVAQPGQAVIP 


6663 


3 


1005 


RPVLSSRVDDFVPPLPETSGRRKiCLERMYS VDRVSDDI p I RTWF 
PKENLFS FQTASTTMQAISNFR KHLRMVGSR R VKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVMLISS KVPKAEYI PTI IRRDDPSI I 
PILYDHEHATFED ILEE IERKLNVYHKGAKI WKMLIFCQGGPGH 
LYLLKNKVATFAKVEKEEDMIHFWKRLSRLMSKVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRLYRLWKSRQHS KLLDFDDVL 


6664


58


968


PPJLLRLPRSVVVMDSPWDEiALAFSRTSMFPFFDIAHYLVSVMA
VKRQPGAAALAWKNPISSWFTAMLHCFGGGILSCLIiLAEPPIjKF
LANHTNT LLAS S I WY I T FFCPHDLVSQGYS YLP VQLLASGMKE V

trtwkivggvthansyykngwivmiaigwargaggtiitnferi,
vkgdwkpegdewlkms ypakvtllgsvi ftfqhtqhlais khnl

MPLYTIFIVATKITMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS
SCE KKS EAKS PSNGVGS LAS KP VDVASDNVKKKHTKKNB


6665


171 


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL

PGRWLGPGCTQNPCSVHTATGPEPRKLPLLPPDSPNSGYPKEPA 
ALCPGI PSPCRMTHQDLS ITAKLINGGVAGLVGVTCVFP IDLAK 
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVWLTLVTP 
EKAI KIiAANDFFRRI^EIXSMQRNIjKMEMLAGCGAGMCQVVVTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 

atliaweixrtqgjlaglyrglgatllrdipfs 1 1 yfplfanlnn 

lgfnelagkasfahsfvsgcvagsiaavavtpldvlktriqtlk 
kglgedmysgitdcar 


6666 


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFlVSCLiTL 
WSPGCWPQPIQKEGVGLWDIRKPQSSLLRYGGMLSIiQSAMSVRF 
MSNGTQLLALRRRLPPVLYDXHSRLPVFQFDNQVYFNSCTMKSC 
CFAGDRDQ Y ILSGSDDFNLYMWR I PADPEAGGIGRWNGAFMVL 
KGHRS I VNQVRFNPHTYMICSSGVEKI IKIWS PYKQPGCTGDLD 
GRIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
FFDSLVRREIEGWSSDSDSDIjSESTILQLHAGVSERSGYTDSES 
SASLPRSPPPTVDESADNAFHLGPIiRVTTTNTVASTPPTPTCED 
AASR0X2RLSALRRYQDKRLLALSNESDSEEKVCEVELDTDLFPR 
PRSPSPEDES SSSSS SSSSEDEEEIjNERRASTWQRNAMRRRQKT 
TREDKPS AP I KPTNT Y IGEDNYD YPQIKVDDLSSSPTS S PERS T 
S TLE I QPSRAS PTSD I ES VERK I YKAYKWLR YS YI S YSNNKDGE 
TSliVTGEADEGRAGTSHKDNPAPSS SKEACLNI AMAQRNQDLP P 
EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEH?PETKKIiNTGKAI*SSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSbETICANHNNGRLHPRPPHPHNNGQNIiGELEW 
AYS S PGHS DTDRDNSSLTGTLLHKDCCGSEMACETPNAGTRKDP 
TDTPATDSSRAVHGHSGLKRQR I ELEDTDS ENS S 55E KKLKT 


6667 


171 


1310 


AEEVERItAAMRSDSLVPGTHTPPIRRRSKFANLGRIFKPWKWRK 
KKSEKFKHT S AALERKI S MRQSREEIiI KRG VLKE I YDKDGELS I 
SNEEDSLENGQSLSSSQLSLPAIiSEMEPVPMPRDPCSYEVLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPIiPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQLQYGSHGQHLPSTTGSIi 
PMHPSGCRMIDELNKTIAMTMQRLESSEQRVPCSTSYHSSGIiHS 
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHESDYEDSSCL 
YTRE EEEEEBDEDDDS SI* YTSSIjAMKVCRKDS LAI KPS NRPS KR 
ELEEKNI LPRQTDEERLELRQQIGTKL 


6668 


714 


358 


TIAVATGPALTLRCHVCTS S SNCKHS WCPAS SRFCKTTNTVEP 
LRGNLVKKDCAE S CTPS YTLQGQVS SGTSSTQCCQEDL CNE KLH 
NAAPTRTALAHSALSLGLALSLLAVI LAPSLi 


6669 


459


1207 


KDEETRKDYDYMLDHPEEYYSHYYHYYSRRLAPKVDVRVVILVS 
VCAI S VPQFFSWWNS YNKAI S YIjATVPKYRIQATE I AKQQGIiLK 
KAKE KGKNKKS KEE I RDEEEN 1 1 KN 1 1 KS KID I KGG YQKP Q I CD 
LIjLFQI I LAFFHLCS YIVWYCRW IYNFNIKGKEYGEEERLYI I R 
KSMKMSK3QFDSI*EDHQKETFLKRELWIKENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEGPGRLT FVDD 


6670 


184 


594 


VARI *GEAAKMS S EPPPP YPGGPTAPLLEEKSGAPPTPGRSS PA 
VMQP PPGMPIjPPAD I GPP P YE P PGHPMPQPGFI PPHMS ADGTYM 
PPGFYP P PGPHP PMG Y YP PGP YTPGP YPGPGGHTATVLVP SGAA 
TTVTV 


6671 


1 


763 


bPAEKPRSAPNMAGGRCXSPQLTALLAAWIAAVAATAGPEEAALP 
PEQSRVQ PMTASNWTLVMEGEWMLKF YAP WCPS CQQTDSEWEAF 
AKNGE I LQ I S VGKVDVI QE PGLSGRFFVTTIiPAFFHAKDG I FRR 
YRGPGI FEDIiQNYIIiEKKWQS VEPLTGWKS PASLTMSGMAGLFS 
ISGKIWHLHNYFTVTI«GIPAWCSYVFFVIATLVFGLSMDLVL*V 
ISQCNWDPPYRHVS*/RPSTNLGVHTAHTSEHLRL 


6672 


304 


1089 


APGSKPVQFMDFEGKTSFGMSVFNCiSNAIMGSGILGIiAYAKAHT 
GVI FFLALLLCIALLS SYS IHL.LLTCAGIAG I RAYEQLGQRAFG 
PAGKWVAT VI CLHNVGAMS S YLFI IKSELPLVIGTFLYMDPEG 
DWFLKGNLLI IIVSVLI I LPLALMKHLGYLGYTSGLSLTCMIiFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
MFHS *IiTGVLTQWP I MAFAFVCHPGGAG PS I TELCRAFQAQD 


6673 


1116 


1963 


LQIQTHHTHHGARVTHLGSHQLLANAGTMLCRQQSSSMAPAFSQ 
S VTCGPSPCVRKQES ATKCLH I GACGSDLWARGWEQG* G * GLNV 
WI.CPCVAFHRGARPQAEEGGARWNSLVSSPWIPPNP*HSSIGAE 
NAVPRP*0/3*KVNPSGQERQS\WVLPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPIiKACPAPRESDPCSSCLSCPCVTQKPRFSDTGW 
FGAGHCHSSCDFTRKGAAGGPG 


6674 


1 


440 


LEFDYMCQYDYVEVRDGDNRDGQI I KRVCGNERPAP IQS IGSSL 
HVLFHSECSKNFDGFHAI YEE I TACSSS PCFHDGTCVLDKAGS Y 
KCACLAGYTGQRCENI.LEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTRVAFFLT 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion,
\=possible nucleotide insertion)
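
The legend above can be applied mechanically when reading a listed segment. The following Python sketch is illustrative only: the names `RESIDUES`, `MARKERS`, and `summarize_segment` are hypothetical and do not come from the specification. It tallies residues and the `*`, `/`, and `\` markers in a segment, ignores whitespace, and counts any other character as unrecognized.

```python
# One-letter residue codes from the table legend; X means "Unknown".
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY") | {"X"}

# Non-residue markers defined in the legend.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment):
    """Tally legend symbols in one amino acid segment.

    Whitespace is skipped. Characters that are neither a legend
    residue code nor a legend marker are counted as "unrecognized".
    """
    counts = {
        "residues": 0,
        "stop": 0,
        "deletion": 0,
        "insertion": 0,
        "unrecognized": 0,
    }
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1
    return counts


# Segment as listed in this table for SEQ ID NO: 6698.
example = "VGSCACAGSCKCKECKCTS CKKSECRAFP"
summary = summarize_segment(example)
```

For the segment shown, the sketch counts 29 residues and no markers. That agrees with the nucleotide span 668 to 754 listed for that entry (87 nucleotides, or 29 codons), assuming the span is inclusive and read in a single frame.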


6675 


277 


1678 


GN WPTERMAFLDNPTI I LAHIRQSHVTSDDTGMCEMVI*! DHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE 
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWnviASARVQDLIGLrcWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLAIiVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDC1.RPS 
ADT WRQEQ I GCCGAACAALRS * DS H KC* EC- 1 SGDKVEIDPVTNQ 
KASTK FW I KQKP I S I DSDLLCAC \ DLAEE 


6676 


277 


1678 


GNW PTERMAFLDNPT 1 1 LAHI RQSHVTSDDTGMCEMVL I DHD VD 
LBKIHPPSMPGD3GSEIQGSNGETQGYVYAQSVDITSSWDFG1R 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE 
KKSbKEKPP ISGKQS ILSVRIiEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKK I D V YLPLHS SQDRLL PMTWTMAS ARVQDL I GLI CWQ 
YTSEGREPKIiNDNVSAYCLHI AEDDGE VDTDFPPLDSNE PI HKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KE I LLKAVKRRKGSQKVSGSRADGVFEEDSQIDI ATVQDMLSSH 
HYKSFKVSMIHRDRFTTDVQL/GC^FPGVLRKRAAPVDCLRPS 
ADTWRQEQ I GCCGAACAALRS *DSHKC+EGI SGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6677 


277 


1678 


GNWPTERMAFLDNPTI I LAHIRQSHVTSDDTGMCEMVL IDHDVD 
DEKIHPPSMPGDSGSEIOGSNGETQGYVYAQSV0ITSSWDFGIR 
RRSNTAQRLERLRKERQNQI KCKN I QW KERNS KQS AQELKSIiFE 
KKSLKEKPPI SGKQSI LS VRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHS SQDRLLPMT WTMAS ARVQDLIGI*! CWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPIiDSNEPIHKF 
GFSTIALVEKYSSPGLTSKESLFVRINAAIK3FSLIQVDNTKVTM 
KE I LLKAVKRRKGSQKVSGSRADGVFEEDSQ1DI ATVQDMLSSH 
HYKS FKVSM I HRLRFTTD VQL/GCALFPGVLRKRAAP VDCLRPS 
ADTWRQEQIGCCGAACAALRS *DSHKC* EGISGDKVEIDPVTNQ 
KAS TKFW I KQK PIS IUS DJLLCAC \DLAEE 


6678 


221 


865 


GPSNQSSGSLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*LR 
PFPCSQLPMSQGCLWHLDCCCPWVPYIPGQQWRKGRQRMRN*QS 
LLGSDQES VGLEDLCVFVNFLLHVLLGLFP* PHELFLLP WDLG 
FLFPLLU3GGCHCLVLPANLVSQAPQIGKLSCRLQTHDLEGSRN 
HHPIiFLWGRWDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRS PGQNWVKTVDGWKRFLDEKSGS FVSDL 
SSYCNKE VYNKENLFNS LN YD/SCSQEEKEGHAE *QNQNS \DFH 
QEKWIYVHKGSTKERHGYCTLGEAFNRLDFSTAILDSRRFNYW 
RLLEL I AKSQLTSLSG IAQKNFMNI LEKVVLKVLEDQQNITLIR 
ELLQTLYTSLCTLVKRVGKS VLVGNI NMWVYRMETI LHWQQQLN 
NIQ I TR VS GQAQP P PGSGS LHRDTGQTRQDFEFTP VTEESGLF 


6680 


1498 


2951 


PLCTLPLMPSALPGWAGERWEKQWPIiA/ PGPGTWQTPVGS ISEE 
P \ RKNEPDTHC PRGE AR PE V * HLPKPHS PGSEGAE I QTSA* AL P 
/NQVS PPQPM *GAEENGDQRGGKEEAGEELHRSSSGLTAAPGF? 
EVHRNLQTFPGLPSRGGGP /GGAGTQGSWAPGEQPP/ SPLLPAS 
MQRSQAGLPG WEAGLVES PTHHI PALRPSGTNATGEAFPSTTCS 
SGP\ PAP PGPTGLRPGGGS S SGGHG * * PGLPVGKV\GALGAAQD 
PQSQGRG PTQGTVGTEMLLSGLGS AKAC PAARPAVP * LPSDPAS 
TIPKKGTRGFGEGPGVLQERNRWWGRAQGFTSADAAGTAPPGV 
* LPAPLSQPPGATEPQVRACGMAPPS PGTSGRLVANGRHPGPQV 
AQGCPPGAGCWGSQPRGSQRCPRTYTHSPLGHGRAPCPRRCWH* 
WQDP PSS PRTGCLPGI PARQAYSAPRTRSRPG I RTGRAAYGF IR 
FQGGGGG 


6681 


1169 


511 


I N Y I Y YNQQQRAFHE LK \EKLMS APALGLPDLTKLFTLH VS ERE 
KMTVGVLTQTVGPWSRPGAYLSKQLLX3VSKGWPPCPRALAATAL 
LAQEADELTLRQNLNRKS PHA\ WTLINTKGHH * L I NARLTR YQ 
TLLCENPH KT 1 EVSNT/ LN PATLLLVTESP VKHNCLE VLDS VYS 
SRPNLRDHP * TS VDWELYVDGSGFANPCKVTLKKETS PAPVTPR 
S 


6682 


109 


1238 


T VLCGAMQVS SLNE VKI YSLSCGKS LPE WLS DRKKRALQKXDVD 
VRRR IEL I QD FEM PT VCTT I KVSKDGQ Y I LATGT YKPR VRCYDT 
YQLS LKFERC tiDSE WTFE I LS DD YS KI V FLHNDRY I E FHSQSG 
FYYKTR 1 PKFGRDFS YHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAAENNVCD INS VHGLFATGTr EGRVECWDPRTRNRVGLL 
i vi>yy lyK* X &>LF1 XoAIjKFN \GAIiTMAVGTTTGQVLIiY 
DLRS DKPLLVKDHQ YGLP I KSVHFQDSIjDL I IiSADSR I VKMWNK 
NSGKIFTSLEPEHDLNDVCLiYPNSGMLIiTANETPKMGIYYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6683 


109 


1238 


i vij^^i^iuvasijNiiVAi xauscx>KHijJb , KWLSDRXKRAI*QKKDVD 
VRRR I EL I QD FEMPTV CTTI KVS KDGQYI LATGT YKPRVRCYDT 
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIE FHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAAENNVCDINSVHGIiFATGTIEGRVECWDPRTRNRVGLIi 
D\AP * T VSQQ I QR * TSLPT I SALKFN \ GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRI VKMWNK 
NSGKI FTSLEPEHDLNDVCLYPNSGMLLTANETPKMG IYYIPVL 
GPAPRWCS FLDNLTEELEENPESNE 


6684 


111 


527 


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA 
w \^^KKiiA w KPA*WLGGPGGDSGGREEGGS/GELQRAMESKMG 
ELPLDINIQEPRWDQSTFLGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685


258 


1473 


KLLGDNFEGFCNKFELSDSENGSNS * QS PIAFDRLFDPDPQKVL 
CGVIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 
c-v->* v v i |£NJN VK-bijljDCHX J. PALLQGLLS PDLKFIEAC 
LRCLRTIFTSP VTPEELLYTDATVI PHLMALLSRSRYTQEYI CQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQAIjKCFS 
VLAFENPQVSMTLVNVLVDGELLPQI FVKMLQRDKP I EMQLTSA 
KCLTYMCRAGAIRTDDNC I VLKTLPCXiVRMCS KERLLEERVEGA 
ETLAYLIEPDVELQRIAS I TDHLI AMLADYFKY P SS VSAI TDIK 
RLDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGPGR PPVT, 
TASRQGVTST 


6686 


310 


927 


DSVTFDDLAVDFTPKEWTLLDPTQRNLYRDVMLENYKNLATVGY 
QLFKP S L IS WLE QEESRT VQRGDFQAS EWKVQLKTKEIiALQQD V 
LGEPTS SGIQM I GSHNGG E VSDVKQCGDVSSEHS CLKTHVRTQN 
SENTFE CYLYG VD FLTLHKKTSTGEQRS VFSHVWKKPS SLNPDV 
VCQKNRCTRKKKAF* LQLTLGKSFH* S I HT 


6687 


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTTSSTSNSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
SRDHRREDR VHYRS PPLATGE PVDNLS PEERDARTVFCMQLAAR 
IRPRDLEDFFSAVGKVRDVRI I SDRNS RRSKGIAYVE FCE IQS V 
PLAIGLTGQRLLGVP I IVQASQAEKNRLAAMANNLQKGNGGPMR 
LYVGS LHFNI TEDMLRGI FEPFGKV 


6688 


1025 


1 


AEVPIO'PRVFHKCPDSCWRFKFQPIQLQPYILLSFSSEKPPISF 
SEPGLPR/S ATARMATAAAPPNSS IDLPSDSGMGFI SPAGDSLD 
LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVICG S KGAGASGS ASCSSRAGKTTEATAASSMPSGTSSFS TC 
TMSELEELFSLFSPAPLLS KL FTSSGS I A I CCQDSG PS DTGRLS 
VCQLWLADSDTGKLSDCQEVVTVGDSGGLTCPELSLGRM*MSLL 
WO 01/53312



PCT/US00/34263 





SSAVIPGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
AEDSTFR I CS PS VSDTSSDSSGS KDNVI>I LFSKVSI * S CFSLSS 
FFSDSISFCFSSSSFCKR* FVS S KVSQNALLS SRLS NG PGGS SK 
QRNSLTARQLAMSL* ATKF * RNACNPNCLSSKKSAL* LSLNQRF 
GGSASRKPGNIS FNSQKCS ALS YCCNFVI KPREVS VSS ENYPAF 


6690


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLIiLLLGSGQGP 
QQ VGAGQTFE YLKREHS LS K P YQG VGTGS S S LWNLMGN AM VMTQ 
YIRLTPDMQSKQGALWNRVPCFLRDWELQVHFKIHGQGKKNL\H 
GDGLAI WYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQEQKKMIQAQESITLEDVAV 
DFTWE EWQLLGAAQKD L YRDVMLENYSNLVAVG YQA5 KPDALFK 
LBQGEQIiWTIEDGIHSGACSDIWKVDHVLERLQSESLVNRRKPC 
HEHDAFEN I VHCS KSQFLLGQNHD I FDLRGKS LKSNLTL VNQS K 
GYEI KNSVEFTGNGDS FLHANHERLHTAI KFPASQKLISTKSQF 
I S P KHQKTRKLEKHHVCS E CGKAF I KKS WLTDHQ VMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPY I CSECGKGFIQKGNLI VHQRI HTGEKP Y I CNEC 
/ GKG F I QKTCLIAHQR FHTER 


6692 


178 


939 


WI KEGELSLWERFCANI IKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHSQGFNKLAETLRWCIJJLGILEVTVYAFSIENFKRSKSEV 
DGLMDLARQKFSRLME EKE ICLQKHG VC I R VLGDLHLLPLDLQEL 
IAQAVQATKN YNKCFLNVCFAYTSRHE I SNAVREMAWGVEQGLL 
DPSD I SESLLDKCLYTNRS PHPDILI-RTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WI KEGELSLWERFCANI I KAGPMPKHIAFIMDGNRR YAKKCQVE 
RQEGHSQGFNKLAETLRWCLNIiGI LEVTVYAFS I ENFKRSKSEV 
DGLMD LARQKFS RLMEE KEKLQ KHG VCIRVLGDLHLLPLDLQEL 
IAQAVQATKN YNKCFLNVCFAYTS RH EI SNAVREMAWGVEQGLL 
DPSD 1 SESLLDKCLYTNRS PHPDI LI RTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPE YTFWNLFEAI LQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
E VHS LGQ I LPQDGLTAEAG PPEAQDPWGS PGI SL PAAH I G FAAA 
LAVG PSG CHTE P \ FDE VWP SL FLGDAYAARDKS KL I QLG I THW 
NAAAG KFQ VDTGAKFYRGMS LE Y YG I E ADDNP FFDLS VYFL P 


6695 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
E VHS LGQ I LPQDGLTAEAGP PEAQDP WGS PG I SLPAAH I GFAAA 
LAVG P SG CHTE P \ FDEVWPS LFLGDAYAARDKS KLI QLG I THW 
NAAAG KFQ VDTGAKFYRGMS LEYYG I EADDNPFFDLS VYFLP 


6696 






PRVRGRVG&KWAFLSVPAAMSSEMEPLLLAWSYFRRRKFQLCAD 
LCTQMLEKSPYDQAAWILKARALTEMVYIDEIDVDQEGIAEMML 
DENAI AQVPRPGTSLKLPGTNQTGGPSQAVRPITQAGRP ITG FL 
RPSTQSGRPGTMEQAIRTPRTAYTAR P ITSSSGRFVRLGTASML 
TSPDGPFINLSRLNLTKYSQKPKLAKALIEYI FHHENDVKTALD 
LAALS TEH SQ YKD WW WK/DQ IEKC Y YR VGM YRE AE KQ I KS S 


6697 


3 


782 


PPLFLRRLNSRALRPGSRKVMAWPASLSGQDVGSFAYLTIKDR 
IPQILTKVIDTLHRHKSEFFEKHGEEGVEAEKKAISLIiSKLRNE 
LQTDKPFIPLVEKFVDTDIWNQYLEYQQSLLNESDGKSRWFYSP 
WLLV\ ECYMYRRIHEAI \ I QS PPIDYFDVFKESKEQNF YGSQES 
I IALCTHLQQLI RTI EDLD \ ENQLKDE F FKLLQ I SLWGE I S VDL 
SL\SGGES S SQNTNVLNSLEDLKPF I LLNDMEHLWS LLSNCK 


6698 


668 


754 


VGSCACAGSCKCKECKCTS CKKSECRAFP 


6699 


325 


492 


EGELP / PARR VLPRAMTAS AQ PRGRRPG VGVG VWTS C KHPRC V 
LLGKRKGSVGAGSFQLPGGHLEFGETWEECAQRETWEEAALHLK 
NVHFASVVNSFIEKENYHYVTIIJ4KGEVDVTHDSEPKNVEPEKN 
ESKRIIYNHAFFFQESKWSGGILQ 


6700 


1098 


1392 


TQCWRSSTPGMRTHFRTQP/RLECGQGFSQQENGHCMDTNECIQ 
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYEPNEDGTACVERT 
LLLGI.CNL.LGK 


6701 


2 


1485 


AAAG PRTR VRRAAAFEGQ PS PS PGLG PTSDKAAAP RTP KRRRLW 
RQRQ / HPAMLC YVTRPDAVLMEVE VE AKANGEDCLNQVCRRLGI 
I EVDY FGLQFTGSKGESLWLNLRNR I S QQMDGLAP YRLKLRVKF 
FVEPHLI LQEQTRHIFFLH I KE ALLAGHLLCS P EQAVE LS ALLA 
QTKFGDYNQNTAKYNYEELCAKELSSATLNSIVAKHKELEGTSQ 
ASAEYQVLQIVSAMENYG I EWHSVRDSEGQKLLI GVGPEGI S IC 
KDDFS P I NRI AYPWQMATQSGKNV YLTVTKESGNS IVLLFKM I 
STRAASGLYKAlTETHAFYRC3>TVTSAVMMQYSRDLKGHIiASLF 
LNENINLGKKYVFDI KRTSKEVYDHARRALYNAGWDLVSRNNQ 
SPSHS PLKSSES SMNCSSCEGLSCQQTRVLQEKLRKLKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAQLQVGESAAHFCLQPHLSL 
LLTGSRSQVLAR 


6702 


397 


1971 


PLAKFLKLDLVNVLCLPMEDVTLFYRTCFCSMGLGSSCHLSLPK 
RAEALL CSRKATWRDLVAVRMAEEQE FTQLCKLPAQPSHPHCV 
NNTYRSAQHSQALLRGLLALRDSGILFDWLWEGRHIEAHRIL 
LAASCDYFKGMFAGGLKEMEQEE VLIHGVS YNAMCQI LHFIYTS 
ELELSLSNVQETLVAACQLQI PEI IHFCCDFLMSWVDEENI LDV 
YRLAELFDLSRLTEQLDTYILKNFVAFSRTDKYRQLPLEKVYSL 
LSSNRLEVS CETEV YEGALLYHYSLEQ VQADQ I S LHE PPKLLET 
VRFPLM EAE VLQRLHDKLDP S PL.RDTVAS ALMYHRNES LQPS LQ 
S PQTELRSDFQCWGFGGIHSTPS\MS SATRPKYLNPLLGEWKH 
FTASLAPRMSNQG I AVLNNFVYLIGGDNNVQGFRAESRCWR YD P 
RHNRWFQ I QSLQQEHADLS VC WGR Y I YAVAGRD YHNDLNAVER 
YDPATNS WAYVAPLKREVYAHAGATLEGKMY ITCGRKGR I T 


6703 


45 


1244 


GVGPRAAAMPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVA 
KRM1RMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 
YGEHKRHLEFSHDQYRELQRYAEEVG I F F TASGMD EMAVE FLHE 
LNVP FFKVGSGDTNNFP YLEKTAK/ TRG WHS VLRDVCGVQLNDE 
TSSWDVLGRVRTS KEKVI^MVLVLDYSGRPMVISSGMQSMDTMKQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPD I P 
I G YSGHETG IAIS VAAVALGAKVLERH I TLDKTWKGSDHSASLE 
PGELAELVRS VRLVERALGS PT KQLLP CEMACNEKLGKS WAKV 
KI PEGTI LTMDMLTVKVGE PKG YPP ED I FNLVGKKVLVTVEEDD 
TIMEE 


6704 


82 


1007 


TMNTRNRWNS GLGAS PASRPTRDPQDPSGRQG ELS P VEDQREG 
LEAAPKGP SRE S VVHAGQRRTSAYTLjIAPN INRRNB IQR I AEQE 
LANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQLQLMQSK 
YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 

NLRREAFREHQQYKTAEFL/RQTEHRIARQKCLSKCCLWPTILN

MU\2KJjV3 Jjy \ JJoLiKAJb bNKKMJKM KD r*LLELKRQQQEQE 
RAKIHQTEHRRVNNAFLDRIiQGKSQPGGLEOSGGCWNMNSGNSW 
GI 


6705 


2 


786 


RLCRNSARVPCGWSASRSLGEGAGFIGPLRGPHPRAGGTGTSFT 
SYKRXGGIMSTIAAFYGGKSILITVATGFLGKELMEKLFRTSPD 
LKVIYILVRPKAGQTLQHRVFQILDSKLFEKVIEVRPNVHEKIR 
Al YADIiNQNDFAI S KEDMQELLS CTNI I FHCAATVRFDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHI STAYSNCNLKHI DEVI Y 
PCPVEPKKIIDSLEW\LDDAIIDEITPKI*IRDWPNIYTYTK 


6706 


130 


531 


FTHSSSSHSQEMLGKLNMLRNDGHFCDIT IRVQDKI FRAHKWL 
AACSDFFRTKLVGQAEDENKNVLDIJIHVTVTGFIPLLEYAYTAT 
LS INTENI IDVLAAAS YMQMFS VASTCSEFMKSSILWNTPNSQP 
EK 


6707 


2233 


1343 


YWSGIGYELQHFHWRKFHFEKKGPPSTCQBRLYESRSRWPCIS^ - ' 

GMVWGWTAVNGSW*GGQLRCVCVCTSHSSDSTRSSQRASKCHS 

FFILSQ*KT*SSWENWVFAKYSRIYSYGHSCSKGRGD*DFK*NV 

SQAR * SRFCGLCNPCGHCGLDINLRGGSSPWTDKHSCVHNNLLC 

NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 

TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYA 

C * RCHWY FE WLL YNHCGDI LVACL+ RRQL* S SQ 


6708 


115


1729 


TVGS WSRSGRS PPVGRQLLLTGRGAQAAGS PQGGMAI.QVE LVPT 
GE I IRVVHPHRPCKLALGSDGVRVTMESALTARDRVGVQDFVLI. 
ENFTSEAAFIENI>RRRFRENLIYTYIGPV1,VSVNPYRDT^IYSR 
QHMERYRGVSFYEEPPHLLAVADTVYRALRTERRDQAVMISVES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNSSRFGKYMDVQFDFKGAPVGGKILSYLLEKSRWHQ 
NHGERNFHIFYQLLEGGEEETLRRLGLERNPQSYLYLVKGQCAK 
VSSINDKSDWKWRKALTVIDFTEDEVEDLLSIAASVLHLGNIH 
FAANEESNAQVTTEKQLKYLTRLLSVEGSTbREALTHRKIIAKG 
E ELLS PLNLEQAAYARDALAKAV YS RTFTWLVGK I NRS LAS KD V 
ESPSWRSTTVLGLLDrYGFEVFQHNSFEQFCIWYCNEKLQQLFI 
ELTLKSEQEE YEAEG I AWEPVQYFNNKI ICDLVEEKFKG 1 1 \S I 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVSKRSRKEBEDLEALIAHFQTLDAKRTQTVELPCPP 
PSPRLNASLSVHPEKDELILFGGEYFNGQKTFLYNELYVYNIRK 
DTWTKVDI PSPP PRRCAHQAWVPQGGGQLWVFGGEFASPNGEQ 
FYHYKDLWVLHLATKTWEQVKSTGGPSGRSGHRMVAWKRQLILF 
GGFHESTRDYIYYNDVYAFNLDTFTWSKLSPSGTGPTPRSGCQ\ 
I PSLPRAAS S VYGGYSKQRVKKDVDKGTRHSDMP 


6710


158 


980 


RHKMTNYRVESSSGRAARKMRLALMGPAFIAAIGYIDPGNFATN 
IQAGASFGYQLLWWVWANLMAMLIQILSAKLGIATGKNLAEQI 
RDHYPRPWWFYWVQAEIIAMATDLAEFIGAAIGFKLILGVSLL 
QGAVLTGIATFLI LMLQRRGQKPLEKVIGGLLLFVAAAY I VELI 
FSQPWLAQLGKGM VIPSLPTSEAVFLAAGVL \GATIMPHVX / YI 
WHS SLTQHLHGGS RQQRYS ATKWDVA I AMTIAGF VN LA I MATAA 
SELNFYGHTGVA 


6711 


3 


347 


VTECKTMTCKMSQLERNI*TMINTLHHYSVKLGHPDTLIHGEFK 
ELVRTDLHN I LMXBNKNDQAI +H I MEDLDTNAHMQ II FKEL IML 
MAMLTWSYHDNMHDADYGPGQQHRPG


6712 
6713 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRLPPGENIDDWIAVHV 
VD FFNR I NL I YGTMAERCS * TS CP VMAGG PR YE YR WQDERQ YRR 
PAKLSAPRYMALLMDWIESLI 




2485 


3 


QARGS DS EDG E FE IQAEDDARARKLGPGR PLPTFPTS ECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLS YPVFKG IMKKGYKVPTP I 
-i J-t v i.uiAjfli/v v«i»i/\k iv7ooJ\.iAv_r LtUf Mr bKIjKTHSAQTG 
ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF 
AALHENPDI I IATPGRLVHVAVEMSLKLQS VEYWFDEADRLFE 
MGFAEQLQEI IARI .PGGHQTVLFSATLPKLLVEFARAGLTEPVL 
IRLDVDTKLNEQL KTS FFLVREDTKAAVLLHIXHNVVRPQDQTV 
VFVATKHHAEYLTELLTTQRVSCAHIYSALDPTARKINLAKFTL 
GKCSTLI VTDLAARGLDI PLLDNV1NYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQS WDEEDSGLQSTLEAS LELRGLAR VADNAQQQ 
YVRSRPAPSPES I KRAKEMDLVGLGLHPLFSSRFEEEELQRLRL 
VDS I KNYRSRATI FEINASSRDLCSQVMRAKRQKDRKAIAR FQQ 
GQOGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVEDIFS 
EWGRKRQRSGPNRGAKRRREEARQRDQEFYIPYRPKDFDSERG 
LS I SGEGGAFEO^AAGAVLDLMGDEAQNLTRGRQQLKWDR KKKR 
FVGQSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 
GRRRGILTRRRPRTEEVGEARPIJVQAGCIPGPHAPRHPLQAESA 
LELKTKQQ I LKQRRRAQ KAALS LQRWWPQ AALCPQ 


6714 


169 


1416 


NNCQELLP P P PAPMAH I PSGGAPAAGAAPMG PQ YCVCKV ELS VS 
GQNLLDRD VTSKSDP FCVLFTENNGRW I EYDRTETAINNLN P AF 
SKKFVLDYHFEEVQKLKFAIiFDQDKSSMRLDEHDFLGQFSCSLG 
T I VS S KKI TRPLLLLND KPAGKGL I T IAAQELS DNRVITLSLAG 
RRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW 
KPFTVPLVSliCDGDMEKPIQVMCYDYDNDGGHDFIGEFQTSVSQ 
MCEARDSVPLEFECINPKKQRKKKNYKNSGI I ILRSCKINRD YS 
FLDYIIX3GCQLMFTVGIDFTASNGNPLDPSSriHYINPMGTNEYI, 
SAIWAVGQIIQDYDSDKMFPAIX3FGAQLPPDWKVSHEFAINFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


GPAGAESGSLHCLPAWQALAGAAHSPHGGQPPRRGPLIGSGMP 

GKPKHLGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLSIHSL
PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS
AENVTFWKACERFQQIPASDT


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ

HTVTLIIRVSLCCSK 


6717 


115 


896 


LFAMSGFENIiNTDFYQTSYSIDDQSQQSYDYGGSGGPYSKQYAG 

ydysqqgrfvppdmmqpqqpytgqiyqptqaytpaspqpfygnn 
fedepplleelginfdhiwqktltvlhplkvabgsimnetdlag 
pmvfclafgatlllagkiqfgyvygi sajgcxgmfcllnlmsmt 
GVS fgcvas vlg ycllpmi llss favifslqgmvgi iltagi ig 

WCSFSASKIFISALAMEGQQLLVAYPCALLYGVFALISVF 


6718 


290 


599 


KQSSWPGTILPSLKWHNSGLCKFPETGGKMTTFKEGt.TFKDVA 
VIFTEEEIiGLLDPVQRNiYQDVMLENFRNLIiSVGHHPFKHDVFIi 
LEKEKKLDIMKTATQ 


6719 


1 


691 


Pl'RPEEQDREDGKCHKMEMNPISGNLNC^PIAMSQCSSDHGCET
DLDSDDDKIEKPNNFMKDSASQDNGLSRKISRKRVCSSDSDSSL 
QVVKKSS KARTGLLRITRRCAATAANKI KLMSDVEDVSLENVHT 
RSKNGRKKPIJILACTTAKKKLSDCEGS VHCE VPS EQ YACEGKP P 
DPDSEGSTKVLSQALNGDSDSEDMIiNSEHKHRHTNIHKIDAPSK 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTV YPQRGTMPGTKRFQHVI ETPE PGKWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFOEEGQALSTY 
QRI>YS ES I LTTMVQVAGKVQE VLKEPDGGLVVLSGGGTSGRMAF 
LMS VS FNQLMKGLGQKPIiYTYLIAGGDRSWASREGTEDSALHG 
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP 
VLVGFNPVSMARH PFP PPR I LRS LTVFPS LRAPH YQ I TSLLFSM 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VP ITEKSNPLTQDLDKADAENI VRLLGQCDAEI FQEEGQALSTY 
QRLYSES I LTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPLYTYLIAGGDRS WASREGTEDSALHG 
I EELKKVAAGKKRVI VI G I S VGLS AP FVAGQMDCCMNNTAVFLP 

VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSIiLFSM 
SWTLISE 


6722 


1 


390


RSWSKRTWQALPMAVLFLLLFLCGTPQAADNMQAIYVALGEAVE 
LPCPSPSTLHGDEHLSWFCSPAAGSFTTLVAQVQVGRPAPDPGK 
PGRE S RLRLLGNYS L WLEGS KE ED AG R YW CAVLG QHHNYQNW 


6723


173 


659 


VCQYCTARMADFG I SAGQFVAVVWDKSS PVEALKGLVDKLQALT 
GNEGRVSVENIKQLLQSAHKESSFDI ILSGLVPGSTTLHSAEIL 
AE I AR I LRPGGCLFIiKEPVETAVDNNSKVKTAS KLCS ALTLSGL 
VE VKELQKE PLTPEEVQS VREHLGHESDNL


6724 
6725 


173 


659 


VCQYCTARMADFG I S AGQF VAWWDKSS PVEAJLKGLVDKLQAliT 
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL 
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL 
VEVKELQRE PLTPEEVQSVREHLGHESDNL 


6726 


356 


722 


RRRTPPVl IjATMDDDLMLAIjRLQEE WNK?EAERDHAQES LStfVD 
ASWELVDPTPDLQAIiFVQFNDQFFWGQLEAVEVKWSVRMTLCAG 
I CS YEGKGGMCS IRIiSEPLIjKLRPRiCDLVEVFFV 


6727 


98 


714 


HLQ KMERK I.NRREKEKE YEGKHNS LEDTDQX3 KNCKSTIjMTLNVG 
GYLYITQKQTLTKYPDTFIiEGIVNGKIIjCPFDADGHYFIDRDGL 

lfrhvi^flrngelllpbgfrenqliaqeaeffqlkglaeevks 

RWEKEQLTPRETTFIiE ITD^JHDP cnrT.u t p^mb nncr'c u--r t 

VLVSKSRLDGFPEEFSISSNIIQFKYFIK


6728 


1 


831 


FRGMGDERPHYYGKHGTPQKYDPTFKGPIYNRGCTDIICCVFLL
LAIVGYVAVGHAWTHGDPRKVIYPTDSRGEFCGQKGTKNENKP
YLFYFNIVKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS

RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLIPSKPIiARRCF 
PAIHAYKGVLMVGNETTYECKSHGSRKNITDLVEGAKKAWGVLEA 
RQLAMRIFEDYTVSWYWDI ISLGIAMAMSLLFIILLRFLAGIMG 
RGMIIMGILVLGY 


6729 


486 


935 


fcs s w l rs laosslis wkm fl»vg1»tgg i asgks s v i qvfqql.gca 

LIFNQPDRRQLLTXAITHPEIRKEMMKETFKYFLREPRTSPRGKK
HVPSALKEADSLMRRDT


6730 


259 


1191 


VGLTGAQSGRTAS MGRDQRAVAGPAIiRRWLLLGT VTVGFIiAQS V 
lAGVKKFDVPCGGRDCSfifiCnr r V vr vnm r>r\or mirm-u^v 

GLQGFPGIjCKSRKGDKGERGAPGVTGPKGDVGARGVSGFPGADGI 
PGHPGQGGPRGR PG YDGCNGTQGDSGPQGPPGS EGFTGPPGPQG 

PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM

GPVGAPGRPG P PGPPGPKG QQGNRGLGFYG VKGEKGDVGQPGPN 

GIPSDTLHPIIAPTGVTFHPDQYKGEKGSEGEPGIRGISLKGEE 
GIM 


6731 


784 


1015 


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE
RKFKEVAEAYEVLSNDEKRD1 YDKYGTEGLNEF 


6732 


1 


446


GIRKRLHGAWPRVEVGCPWETRESPfJVWT .PT? dtcdt VMMnorri — 

LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEEGTAKEATY 

NDLQVEYGKCQLQMKELMKKFKEIQTQNFSLINENQSLKKNISA 
LIKTARVE INRKDEEI 




102 


1205 


GRWQPJlPPPpsPPLWCI^PGGGSDPQQIiTQLRHCLSHSPQDTPW 
AQRQ VC YTAATTQAAAPATRNCLPDHSGHRPTP PR S HRHHRQEN 
LGSIKPSSRSTKATSTTMAGDGRRAEAVREGWGVYVTPRAPIRE 
GRGRliAPQNGGS SDAPAYRTP PSRQGRRE VR FS DE P PE VYGDFE 
PLVAKERS PVGKRTRbEE FRS DSAKEE VR ESAY YLRSRQRRQ PR 
PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSBE 
DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPrJlYPR 

YEATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 


6733
6734 


613 


1311 


RSCRQVGMRSRWQGGESASDGHISCPKPS I IGNAGEKSLSEDAK 
KKKKSNR KEDDVMAS GTVKRHI*KTSGE CER KTKKS LELS KEDL I 

QLLSIMEGELQAREDVrHMLKTEKTKPEVLEAHYGSAEPEKVLR 
VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQIj 
LI^KCHRRTVYELEWEKHKHTDYKNKSDDFTNLLEQERERLKK 
LLEQEKAYQARKE 




189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAWFTWEEWQDLDD 
ftQRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPMIVEE 
TLNLRLSGGSKKQVFSGICHRSLVELQEVHIjV 


6735 


280 


558 


KSRRAGVTKMSNP FLKQ VFNKDKTFR P KRKFE PGTQRFELHK KA 
QASLNAGLDLRLAVQLPPGEDLNDWVAVHWDFFNRVNLIYGTI 
XDGCT 


6736 


195 


808 


MN YELNFKREMPN I KS LGLTNLN FLLKRLS S VLPL I TD YVYFEN 
SSSNPYLIRRIEELNKTASGNVEAKWCFYRRRDISNTLIMLAD 
KHAKE IEBES ETTVEADLTDKQKHQLKHRELFLSRQ YES L.PATH 
IRGKCSVALLNETESVLSYIiDKEDTFFYSLVYDPSLKTLLADKG 
EIRVGPRYQADIPEMLLEGTFFCVFAVL 


6737 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSRIiESYRPDTDLS 
REDTGCNLQHI SDRENIDDLNMEFNPSDHPRASTI FLS KSQTDV 
REKRKS LF INHHPPGQ I AR KYSS CSTI FLDDSTVS QPNLKYT I K 
CVALAI Y YH I KNRBPDGRMLLD I FDENLH PIiS KS EVP PDYDKHN 
PEO KOI YRFVRTLFSAAOliTAF.C A T UTLVYT ,pr t .T .TV a p m t r»i> 
ANWKRIVLGAI LIiASKWDDQAVWNVDYCQI LKDITVEDMNELE 
RQFLELLQFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAHKLEAISRLCEDKYKDLRRSARKRSAS ADNLTLPRWS PAI IS 


6738 


148


653 


CACAEQPARAE VGAATAb PVRWASGEMAPSGS LAVPLAVLVIJbli 
WGAP WTHGRRSNVR VI TDENWR'RT.T «F(?nWMT P T?VA purn j\ rrarr 

QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDKIAEEEVAKLEKHI^MI^RQEYVKI^KKIAETEKRC^LAAQ 

ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 

VIJtftflSDSWSIJtf*LSSTKEIJ}L^DANPEV™^ 

REDD VFLTELMKLANR FQLQUjRERCE KGVMSLVNVRNC IRFYQ 

TAEELNASTLMNYCAE I IASHWV^EVFfSVMTf »T. 


6740 


3 


631 


S WPDMAEEE VAKI»E KHLMLLRQE YVKLQKKLAETE KRGALLAAQ 
ANKESSSESFISRLIiAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VLAARSDSWSLANLSSTKELDI^DANPEVTMTMU?WIYTDEI,EF 
REDDVFLTEIWKLAmFQLQLU^RCEKGVMSLVNVRNCI RFYQ 
TAEELNASTLMNYCAEI I ASHWVS EVEGVNKAL 


6741 


141 


960 


PLTLP FSSRARAGHTMNTS PGTVGSDP VTLATAG YDHT VR FWOA 
HSG I CTRTVOHODSOVNALEVTPDRSMI AAAVfiP V^t^ydp t t? m 
YDLNSNNPNPI IS YDGVNKNIASVGFHEDGRWMYTGGEDCTARI 
WDLRSRNIiQCQR I FQVNAP I NCVCLHPNQAELI VGDQSGA IH IW 
DLKTDHNEQLI PE PE VS I TS AHI DPDAS YMAAVNSTLVPFS CLIi 
PLAIG II^EGEFESLARRGLI»FI*ACQGNCYVWNI»TGGIGDEVTQ 
LIPKTKIP 


6742 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 
YDLNSNNPNP I IS YDG VNKNTAS VGFHEDGRWI^TGGEDCTAR I 
WDLRSRNLQCQRI FQVNAP INCVGLHPNQAELI VGDQSGAIHI W 
DLKTDHNEQLIPE PEVS I TSAHIDPDAS YMAAVNSTLVPFS CLL 
PLA IG I LQEGE FE SLARRGLLFLACQ/3NC!YVWNLTGGIGDE VTQ 
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGDPNPSAAPTSTCAPRKMPKRI S ISKQLASVK 
ALRKCSDLEKAIATTALIFRNSSDSDGICLEKAIAKDIiLQTQFRN 
FAEGQETKPKYREI LSELDEHTENKLDFEDFMILLLS ITVMSDL 
LQNIR * 


6744 


95 


1343


RTPARNRCAGCE VLSRFSS PNKASS FALQSAGGGI.PAVRALRRD 
RQKVSTVG YGMDEVEQDQHEARL KELFDS FDTTGTGSLGQEELT 
DLCHMLSLEEVAPVLQQTIoLQDNLIiGRVHFDQFKEALI LILSRT 
LSNEEHFQEPDCSLBAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTVIEPLDEEARPSHI PAGDCS EHWKTQRSEEYEAEGQLRFWNP 
DDLNASQSGSSPPQDW I EEKLQEVCEDLG I TRDGHLNR KKLVS I 
CEQYGLQNVDGEMLEEVFHWLDPDGTMSVEDFFYGL,FKNGKSL,T 
PSASTP YRQLKRHLSMQS FDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHAS VERI LDTWQEEG I ENSQE I L JCALDFGLDGNTNL TEL 
TLALENELLVTKNS IHQACI 


6745 


1 


588 


TFRIX^GWAQRRRWLLGCASWESWEAAIAAGPGLPSSTARQQNNP
AAGTEC FAAVW ARGTAMGS VI*S TDSGKS APR aT a pqt t?t> ucno 

ELPVTSFDCAVCLEVLHQPVRTRCGHVFCRSC1ATSLKNNKWTC 
P YCRA YLPS EG VPATDVAKRMKS E YKNCAECDTIjVCLS EMRAH I 
RTCQKY I DKYG P LQE LEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKS S YFQTTEI
VEFGNQLEGKWAVLGTLLQEYGLLQRRIiENVENIiLRNRN 


6747 


247 


484 


EAVTFKDVAWFTEEEI^LLDIAQRKItYRDVMLENFRNIiLSVGH ' 
QPFHRDTFHFLREE KF WMMD I ATQREGNS V YAGVC 


6748 


201 


665 


1UU «vg/\v ± r ajjvavvj? jl iiiiJc,iA»jjjjUir'Ay KKLi YRDVMLENFRNL 
LS VGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKI Q IEMET 
VPEAGPHEEWSCQQI WEQI ASDLTRSQNS IRWSSQFFKEGDVPC 
Q I EARLS I S X VQQXPYRCNECKQ 


6749 


95 


7X9 


RREVKGGDGVCPRARGSPQSQQFPSCAGGGEGLQQSGEALDGAM
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF 
VDVDLLLGE I DPDQAD I T YEGRQKMTS LSSCFAQLCHKAQS VSQ 
INHKLEAQLVDLKSELTETQAEKWLEKEVHDQLLQLHSIQLQL 
HAKTGQSADSGTIKAKLSGPSVEELERELKAN 


6750 


3 


428 


SCESRRPGAKWVWASGALPRDTTGLGSEQPSGDVAQSNRATMGT 
fin jjijfcijuuy KLiMi*]? LiCNMDNKDD VWLEE I QEEAERMFTR 
EFS KEPELM PKT PSQKNRRKKRR I S YVQDENRDP I RRRLSRRKS 
RSSQLSSRR 


6751


152 


1417 


PTKATEMAGAS VKVAVRVRP FNS REMSRDSKCI IQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QPIAFEG YNVCI FAYGQTGAGKS YTMMGKQEKDQQGI 1 PQLCEDL 
FSR INDTTNDNMS YS VE VS YMEI YCER VRDLLNPKNKGNLRVRE 
HPLIiGPWEDLSKLAWSYlTOIQDLf^DSGNKARWAATNMNETS 
SRSHAVFNIIFTQKRHDAETNITTEKVSKISLVDLAGSERADST 
GAKGTRLKEGANINKSLTTLGKVI SALiAEMDSGPNKNKKKKKTD 
FIPYRDSVLTWLLRENI^GNSRTAMVAALSPADINYDETLSTLR 
YADRAXQIRCNAVINEDPKITKLIRELKDEVTRLRDLIjYAQGLGD 

ITDMTNALVGMSPSSSLSALSSRNV


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSDIKMQYSHHCEHLLERLNKQREAGFL

VKAIXJFQKLLEFIYTGTl^DSWNVKEIHQAADYLKVEEVVTKC 
KIKMEDFAFIANPSSTEISS ITGNI ELNQQTCLLTLRDYNNREK 

SEVSTDLIQANPKQGALAKKSSQTKKKKKAFNSPKTGQNKTVQY
psdilenasvelfldanklptpweovaqindnseleltswen 
TFPAQDIVHTVTVKRKRGKSQPNCALKEHSMSNIASVKSPYEAE
NSGEELDQRYSKAKPMCWTCGKVFSEASSLRRHMRIHKGVKPYV
CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF
hspj^hhgeekpykcdvcnlqfatssnlkiharkhsgekpyvcdr 
CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAFAVSSSLITHSR
KHTGEKPFICELCGNSYTDIKNLKKHKTKVHSGADKTLDSSAED

HTLSEQDS IQKSPLSET^VK^SDMTLPLALPLGTEDHHMLLPV 
TDTQSPTSDTLLRSTVNG YSEPQLI FLQQLY 


6753 


2 


1305 


VPSLPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS

PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG
SARGEKEMEGVALKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV

AHPGPPPASSQTPAPEHDKAAKIWPIAQKPALAPKPTSQTPPAS 
PLSKLSRPYIiVELLSRRAGRPDPBPSEPSKEDQESSDRRPPSPP 
GPEERKGQKRDEEEEATERKPASPPIiPATQQEKPSQTPEAGRKE 
KPMLQSRHSLDGSKLTEKVETAQPLWITLALQKQKGFREQQATR 
EER KQAREAKQAEKDS KENVSVS VQPG SS S VS RAGSLHKSTALP 
EEKRPETAVSRLERREQLKKANTLPTSVTVEISYSSPAAP1.VKE 
VSKRFSSPDDAPVSSEPAWLAIAKRKAKAWSDCPLIIK 


6754 
6755


2 


413 


F VRRRRRRLGG P E VNTMSSLHKSR I AD FQDV1»KE PS I ALEKLRE " 

LSFSGI PCEGGLRCLCWKIIiLNYLPLERASWTS I LAKQREI* YAQ 

FLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVLL 




298 


1343 


PGLQUJVALEADWFLDMPGGRRGPSRQQLSRSALPSLQTLVGGG
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LLFEFLFFIYLLVALFIQYINIYKTVWWYPYNHPASCTSLNFHL 
I DYHLAAF I T VMLARRIiVWADI S EATKAGAAS M I H YMVL I S ARIi 
VLLTLCGWVLCWTIiVNLFRSHSVLNIiliFLG YP FGVY VPLCCFHQ 
DSRAHLLLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQ FNNATP I P THS CPLS PDDIRNE VECLKADFNHR I KE VI, FNS 
LFS AY YVAFLP LCF VKVSG Y L.TFMCFLDLCVNY INW VFLV 


6756 


180 


754 


IERALGSDPLS I PVSWGSLRTLKYQQQPLRPKVI,LCQTRVQCHD~^ 
IJ^SLQPQPPGLKQSFCLRVLGLQTGATTPGIiRDLTCKELIIJbTE 
REAQKRKKRKEKESGMALTQGPDTFRDVAIEFSQEEWKSLDPVQ 
KAL Y WD VMIiENYRN LVFLG KDNFAli E VKI CP RVF L Y FLCCI»S W E • 
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLATVCMLLFSHLSAVQ 
TRGIKHRIKWNRKALPSTAQITEAQVAEMRPGAFIKQGRKLDID 
FGAEGNRYYEAN YWQFPDG IHYNGCSEANVTKEAFVTGCINATQ 
AANQGEFQKPDNKLHQQVLW 


6758 


1 


1008 


ASGPELPGRRFRDRAPWDPARLLRGVXAVWVSLSADGPGSFCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPS FRRNMANNS P AJLTGNS QPQHQAAAAAAQCX3QQCGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMILTN ILSS P YFKVQLYELK 
TYHE WDE I YFKVTHVE P WE KGSRKTAGQTGMCGG VRG VGTGG I 
VSTAFCLLYKDFTIjKIjTRKQVMGL ITHTDS P Y IRALGFMYIRYT 
QP PTDL WD WFES FLDDEEDLDVKAGGG CVMT IGBMDRSFLTKliE 
WFSTLFPRI PVPVQKNIDQQI KTRPRKI 


6759 
6760 


1 


513 


RKHNFHSLDGTSTRAFHPQTGLPDnSSPVpQRKTQSGCFDIiDSS 
LLHLKSFSSRSPRPCDNIEDDPDIHEKPFLSSSAPPITSbSDLG 
NFEESVLNYRFDPLG X VDG FTAE VGASGAFCPTHLTLP VEVS FY 
S VSDDNAP S P YMG VI TL.ESLGKRG YRVP PSGTIQWCVL 




239 


606 


VIjSKKKGLSAEEKRTRMMEI FSETKDVFQLKDLiEKIAPKE^JT ' 
AMS VKEVLQSLVDDGMVDCER IGTSNYYWAFPSKALHARKHKLE 
VLESQLSEGSQKHASLQKS I EKAKIGRCETEERT 


6761 


29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS

vyK^j^jj^iaJf y5>JjH L MTSKKuVNSVAGCADDALAGLVACNP 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGS ilaairavaqagtvgtlli VKNYTGD 
RLNFGLAREQARAEG I PVEMWIGDDSAFTVLKKAGRRGLCGTV 
LIHKVAGALAEAGVGLEEIAKQVNVVTKAMGTLGVSUSSCSVPG 
S KPTFEI»SADBVELGIjG I HGEAGVRR I KMATADE I VKLMtiDHMT 
NTTNAS HVP VQPGSS WMMVNNLGGLS FLE LG I IADATVRSLEG 
RG VKIARALVGTFMS ALEMPG I S LTI»3L»LVDE PLLKL IDAETTAA 
AWPNVAAVSITGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL 
ERVCS TLLGLEEHIiNAI,DRAAGDGDCGTTHSRAARAI qe WLKEG 
PP PAS PAQLLS KLSVLLLEKMGGS SGAI» YGLFI/TAAAQPLKAKT 
SLPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQBL 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWLSLFI 
QVAF I TLAVAAGLYYLAELIEE YTVATSR 1 1 KYM IWFS TAVL IG 
L YVFERFPTSM IGVGLFTNIiVYFGLLQTFPFI MLTS PNFI LSCG 
LWVNHYLAFQFFAEEYYP FSEVLAYFTFCLW 1 1 PFAFFVSLSA 
GENVLPSTMQPGDDWSNYFTKGKRGK 


6763 


2 


760 


SGPD FPGRR FRGCCC VRP PAGAGME LGGHWDMNS APRLVS ETAE 
RKQEQKTGTEAEAADSGAVGARRFLLCLYLGGFLDLFGVSMWP 
LLSLHVKSLGASPTVAGIVGSSYGII^LFSSTLVGCWSDVVGRR 
SS I*LACI LLSALGYLLIjGAATNVFL FVLAR VPAG I FKHTLS I SR 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYJ.TELEDGF 
YLTAF ICFLVFI LNAGLVWFFPRREAKPGSTE 


6764 


80


438 


LKKMDTMMLS VRNLFEQLVRRVE ILSEGNEVQF IQLAKDFEDFR~ 
KKWQRTDHELGKYKDLLMKAETERSAiDVKLKHARNQVDVE I KR 
RQRAE ADCE KLERQI QL I REMLMCDTSGS I Q 


6765 


3 


550 


AR YSRVDH FCRRRCRAVARAPR FLLQ FPSGPSRHFLAAC VARWIj "" 
RGSVLVSEALSG3AKDGI VTEVAVGVVPfiQDFT.r.^rcvT c? o dmc 

nmssmvvtangndskkfkgedkmdgapsrvlhirklpgevtete i 

VIAI^LPFGKVTNILMLKGK^OAFLEIATEEAAITNGNYYSAVT • 
PHXiRNQ j 


6766 


1 


1287 


EGGS FXASLTWLWPLGEMKLHCE VEVI SRHLPALGLRNRGKGVR | 
AVLSLCQQTS RS OP PVRAFLLiIS TLKDIfRfiTR VPT »P pn tv^vitt ! 

KFVDEGKATVRiKEPPVDICLSKANSSSLKGFLSAMRIAHRGCN j 
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHL < 
QTSYCGLVRVDMRMLCLKSLRKLDLSHNHIKKLPATIGDI.IHLQ 

ELNLNDNHLBS fsvalchstlqkslwsldls knki kalpvqfcq 

I^ELKJJLKIJDDNELIQFPCKIGOLINLRFLSAARNKLPFLPSEF 
RNLSLEYLDLFGNTFEQPKVLPVI kloapltllessarti lhnr 
I P YGS H 1 1 PFHLCQDLDTAKICV CGRFCLNS F I QGTTTMNIiHS V 
AHTWIiVDNLGGTEAP 1 1 SYFCSLGCYVNSSDI 


6767 


336 


913 


APMICLCSSDLQFRYKEAFUU)RGI^IGYCSVDDDPRMKHFLNV~" 
GRIiQSDNEYKKDFAKSRSQFHSSTDQPGLLQAKRSQQLASDVHY 
RQPLPQPTCDPEQIjGLRHAQKAHQLQSDVKYKSDLKLTRGVGWT 
PPGSYKVEMARRAAELANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATE I LHVKKKKALLL 


6768
6769 


2 
284 


363 
396 


PGS TI S C YIi LSEGSLPLCMQ VACX3EEKHRAPTMKTLRARFKKTE 
LRLSPTDLGSCPPCGPCPIPKPAARGRRQSQDWGKSDERLLOAV 
ENNDAPRVAALIARKGLVPTKLDPEGKSAFHL 


6770 


1 


397 


MSTPDFSTAENNQELANEVSCiJCAMIiTLMLQAMGQAD 
QRNYQVIWSSTMAKIJiDYYKDEVVKKLMTEFNYNSVMQVPRVEK 
ITl^MGVGEAIADKKIiDNAAADLAAISGQKPLITKARKSVAGF 
KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAKS 


6771 


3 


378


Ap AGTLAMTGKS VKDVDR YQAVLANLLLEEDNKFCADCQSKG PR 
WAS WN I GVF I C I RCAG IHRNLG VH I S R VKS VNL D Q WTQEQ I QCM 
QEMGNG KANRLY EAYLPETFRR PQ I DP YLFWSNLEG 


6772 


1 


1400 


AAAFLQGMT VNG F INT VI TS I* \ ERR YDLHS YQSGIiI ASS YD I AA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P* *GWKLDAGVRTCPANPR\ P VC7VG\HTSGLSRY0LVFMLGQFL 
HGVGATPLYTLG VT YLDENVKSSCSP I Y IAI FYTAAI LGPAAG Y 
L I GGAL LN I YT EMG R RTELTT ES PL WVGAWWVG FLG S GAAAF FT 
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT 
IRDLPLSIWLLLKNPTFILLCLAGATEATLITGMSTFSPKFLES 
QFSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVI K 
FCLFCTWS LLG I L VFSLHCP S VPMAGVTAS YGGSLL PEGHLNL 
TAPCNAACSCQPEHYSPVCGSDGLMYFSLCHAGCPAATETNVDG 
QKVYRDCSCIPQNLSSGFGHATAGKCTST 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6773 


1 


630 


PWEAPKEHKYKAEEHTWLTVTGEPCHFPFQYHRQLYHKCTHKG
RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT 
E QAAVARCQCKG PDAHCQRLASQACRTNPCLHGGRCLE VEGHRL 
CHCPVGYTG?FCDVGE*GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQYFLFFI LS S / WVPTFLSMDVDGRVIKADS FS KI I SS 
GIiRIGFIiTGPKPLIERVILKIQVSTLHPSTFNQLMISQ 


6775 


104 


614


TC PSQLRVLTARGGRRAPS PQLWTLVLALIEE KWRSHRI LRMNS 
GRPETMENLPALYTI FQGEVAMVTDYGAFI KI PGCRKQGLVHRT 
HMSS CRVDKPSEI VD VGDKVWVKL IGREM KNDR I K VS LS MKWN 
QGTG KDI*DPNNV \SLSKKRGGGDPSR ITLGRRSPLRLS 


6776 




1108 


. HERHERHEGALSQDALLRIS I PLDSNMRPEKCRRFVHPQWQLLH 
| LNGTFPKTSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
; TS VAKFVFMAGMMVGG ILGGH LS DRFGRR FVLRWC YLQVAI VGT 
CAALAPTFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAPAIRDWHILQLWSVPYFVIFLTS 
SWLLESARWLIINNKPEEGLKELRKAAHRSGMKNARDTLTLEIL 
KSTMKKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFKA 
YFGLNbHG/LKHLGNNVFLLQTLFGAV/TPPGQLVLHLGHWGSG 
RVSSRGRVNCLGLFVLQVW 


6777 


779 


63 

- 


CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPLYACGSY 
GRSLGLYAWDDGSPLALLGGHQGGITHLCFHPDGNRFFSGARKD 
AELLCWDLRQSG YPLWS LGRE VTTNQR I YFDIiDPTGQFIjVS GST 
SGAVS VWDTDG PSNDGKPEPVLS FLPQKDCTNG VSLHPSLPLLG 
HCLPVS VCFLS PTESGGRRRGAGPSLGS PRRHVHLECRLQLWWC 
GGGARLQHP+ * SPRARKGR 


6778
6779


"SIT 


805 


IQS I TDESRGS I RRKNPANTRLRLNVP \ EBTAGDSE / ERSPEEE 
VQADPRIRSASPKCPTSSPFPKGRS PEGEGET\ DPEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSKVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 




2 


535 


RALRRQPRLLAANGIEPESMAI SEP I KGSRKPCVNKEELALKKP 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV 
AWVSAKNPAPMRKKKKVS LGPVS YVLVDSEDGRKKPVMPKKGPG 

SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 
V 


6780 


3 


403 


HEWDNKPEININLMSPGKEEISYIPEGDPIDTFVALVRVQDKD 

SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 

LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRYEFVISE 
K 


6781


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 
SAALPTHLQSALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPVFINSSS 1 1 QVMKGS QPS TI PAAPLTTNSGLM P PS VAWG PL 
HIPQN I KFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKEIiNPDEASPQVNTSADQNTIiPSSQ 
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC 
KKVTGSLEKGEEQYGADGETEGOGLDTTAPGLMGTEQLSTELDS 
KTPTPPAPTLLKI4TSSPVGPGTASAGPSLPGGALPTSVRS1VTT 
LVPSELI SAVPTTKSNHGG IAS E SLAG 


6782 


3 


1327 


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKSQ VFKNQDP VLPPR PKPGHPLYS K YMLSV PHG I ANED I VS Q 
NPGELSCKRGDVLVMLKQTENNYLBCQKGEDTGRVHLSQMKLIT 
PLDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








DDLNLTSGE I VYLLEKI DTDWYRGNCRNQIG I FP AN Y VKV I ID I 
PEGGNGKRECVSSHC VKGSRCVARFE YIGEQKDELS FSEGE III 
LKEYVNEEWARGE VRGRTG I FPLNFVE PVEDYPTSGANVLSTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 


6783 


3 


1750 


S YHHHHAQQSAAAS PNIjTAS Q KTVTTTS M ITTKTLPLVIjKAATA 
TMPAS WGQRPT I AMVTA INS QKAVLSTDVQNTP VNLQTS S KVT 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRI,TPRPNF 
LPQVR PKPVAQNNI PIAPAP PPMLAA PQL.IQR P VMLTKFTPTTI, 
PTSQNS I H P VR WNGQTAT I AKTFPMAQLTS I V IATPGTRLAG P 
QT VQLS KPSLEKQTVKSHTE TDEKQTES RTI TPPAAPKP KRE EN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK 
KEEAI P W PGTLAI VH S YI A Y KAAKEEE KQKLLKWSS DLKQEREQ 
LEQK V KQLSNS I S KCMEMKNT I LARQKEMHSSLE KVKQLI RL I H 
G I DLS KP VDSEATVGAI SNG PDCTP PANAATSTPAP S PS SQS CT 
ANCNQGEETK 


6784 


3 


1750 


S YHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPIjVIiKAATA 
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLOTSSKVT 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPKPVAQNNI PIAPAPP PMLAAPQLIQRP VMLTKFTPTTL 
PTSQNS I HPVRWNGQTATI AKTFPMAQLTS I VI ATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVPE 
PERKKSAVTYXNSTMHPGTRKRGRPPKYNAVLGFGAIjTPTSPQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHItDCIiDPPIiKT I PKGMWI CPRCQDOMLK 
KEEAI PWPGTLAI VHS YIAYKAAKEEEKQKLLKWSSDIjKQERBQ 
IjEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH 

GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 
ANCNQGEETK 


6785


1 


528 


lgntvlh ycsmys kpeclklldrs kptvdi vnqagetald iakr 
lkatoc^dllsqaksgkfnphvhveyewni.rqeeidesdddldd 

KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVLAPTRELANHVSRDFKpiXTRKLTVARFYGGTSYQSQ 
INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHVVLDEVDQML 
DLGFAEQVEDI IHES YKTDSEDNPQTLLFSATCPQWVYTVA\ KK 
YMKS R YEQVDLDG KMTQKAATTVEHLA IQCHWSQRP AVI GDVLQ 
VYSGSEGRAI IFCETKKNVTEMAMNPH IKQNAQCIiHGDIAQSQR 
E I TLKGFREGSFKVLVATNVAARGLDI PEVDDVIQSS PPQDVES 
Y I HRSGRTGRAGRTG I CICF YQPRERGQLR YVEQKAGITFKRVG 
VPS TMDL VKS KSMDAI RSLAS VS YAAVDFFR PSAQRI»I E EKGAV 
DALAAALAHI SGASS FEPRSL ITSDKGFVTMTLESLEE IQDVSC 
AWKELNRKLS SNAVS Q I TRMCLLKGNMGVCFD VPTTESERLQAE 
WHDSDW I LSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD * VF YHIjVDFLSD FLVDS VYIiTGRQ I DHLTGLTGL I DHLTSHS 
SVWN 


6787 


2646 


2270 


pssfpknvpleeleeppk*krsglgsltpksqiqngp*pqtfff 
felgspsgvisahcnlrllgssdspapasrvagiigtchhawli 
lvflvbmgfhhvgqaglklltlXvihppwppkvlglqt 


6788 


16 


936 


GGTVDLR \DMI»AVS VLAAVRGGR/ATVRRVRESNVIjHEKSKGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD \ ELDQ 
SEQ
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EVIIiVIGS * DS*GYPKGK* LLPKEVPSR/RVLLSGLTPIjDATQE\ 
FTEDLS K\ YVTTMVCVAVNGKPMLG VI HKP FSEYTAWAMVDGGS 
NVKARSSYNEKTPRI WSRSHSGMVKQVALQTFGNQTTI I PAGG 
AGYKVTjALLDVPDKSQEKADLYIHVTYIKKWDICAGNAILKAJLG 
GHMTTLSGEE I S YTGSDG IEGGLLAS I RMNHQALVRKLPDLE KT 
GHK 


6789 


2 


678 


GNGINVLKI APESAI KFMAYEQ I KRLVW * * PGDS * GF/ YERLVA 
GSLAGAIAQSSIYPMEVLKTRMAbRKTGQYSGMLDCARRILARE 
GVAAFYKGY VPNMLG 1 1 PYAGI DLAVYETLKNAWLQHYAVNSAD 
PGV FVLLACGTMS S TCGQLAS YP LALVRTRMQAQAS I EGAPEVT 
MSS LFKH I LRTEGAFGL YRGLAPNFMKVI PAVS IS YWYENLKI 
TLGVQSR 


6790 


2 


4068 


AP PAGR RRMQAAPRAGCG AALLLW I VS S CLCRAWTAPSTSQKCD 
EPLVSGbPHVAFSS SSS ISGSYS PG YAKI NKRGGAGGWS PSDSD 
HYQWLQVDFGNRKQ I SAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KPYHQDGN I WAFPGNINSDG WRHELQHP 1 1 ARYVRI VPLDWNG 
EGR IGLRI EVYGCS YWADVINFDGHVVLP YRFRNKKMKTLKDVI 
ALNFKTS ESEG V I I>HGEGQQGDY ITLBLKKAKLVLS LNLGSNQL 
GPIYGHTS VMTGSLLDDHHWHSWIERQGRS INLTLDRSMQH FR 
TNGEFD YLDIiDYE I TFGGIP FSGKPSSSSRKNFKGCMES INYNG 
VNITDLARRKKIiEPSNVGNLSFSCVEPYTVPVFFNATSYLHVPG 
Rl^QDLFS VS FQFRT WNPNGLIjVFSHFADNIjGNVE I DLTESKVG 
VHINITOTKMSQI0ISSGSGLNEX3QWHEVRFLAKEN FAI LTIDG 
DEASAVRTNSPLQVKTGEKYFFGGFLWQMNNSSHSVLQPSFQGC 
MQLIQVDDQLVNI>YEVAQRKPGSFANVS IDMCA1 IDRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATCHNSIYEPSCEAYKHLGQT 
SNYYWIDPTOSGPIiGPLKVYCNMTEDKVWTIVSHDLQMQTPVVG 
YNPEKYSVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRL1. 
NTPEX3SPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDADYKQWRKDAGFLSYKDHLPVSQWVGDTDRQGSEAKL 
SVGPLRCQGDRNYWNAASFPNPSSYIjHFSTFQGETSADISFYFK 
TLTPWGVFIiENMGKEDFIKLBLKSATEVSFSFDVGNGPVEIWR 

sptplnddqwhrvtaernvkqaslqvdrlpqqirkapteghtrl 
elysqlfvggaggqqgflgcirslrmngvtldleerakvtsgfi 
sgcsghctsygtncenggkcleryhgyscdcsntaydgtfcnkd 

VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDIiAQ 
EEI RFS FS TTKAPC I LLYX SS FTTDFLAVLVKPTGS I*Q I R YNI»G 
GTREP YNT DVDHRNMANGQPHS VNI TRHEKTr FLKLDHYPS VS Y 
HLPSSSDTIiFNSPKSLFLGKVIETGKIDQBIHKYNTPGFTGCLS 
RVQ FNQI A P LKAALRQTN ASAHVH I QGELVESNCGAS PLTIiS PM 
SSATDP WHLDHUDS ASAD FP YNPGQGQAIRNGVNRNSAI IGGVI 
A\ WI FTPSLCTP YVLP * SR*HVS PHKGTLP I PNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYIiAMG 


6791 


1801 


1193 


TGHEGAKGEKGDKGDIjGPRGERGQHGPKGEKGYPGIPPEIj/PGW 
SAW* SWIiTAASTKVQAILLPQPLE * LGLQI AFMASLATHFSNQ 
NSG 1 1 FSS VETNIGNFFDVMTGRFGAPVSGVY FFTFSMMKHEDV 

eevyvylmhngntvfsmysyemkgksdtssnhavlklakgdew 
lrmgngalhgdhqrfstfagfllfetk 


6792 


33 


1073 


VRHTNWGVDM YliFSLGS ES PKGAI GH I VSTEKT I LAVERNKVLL 
PPLWNRTFS WGFDDFS CCLGS YGSDKVIjMTFENIiAAWGR CI»CAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRIiRQALYGHTQAV 
TCIAASVTFSIiLVSGSQDCTCILWDLDHLTKVTRIiPAHREGISA 
ITISDVSGTIVSCAGAHLSLWNVNGQPLASITTAWGPEGAITCC 
CLMEGPAWDTSQIIITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGLESRAGR* HCFDREAQQNQP\ PVTAL 
AVSRNHTKLLVGDERGRI FCWSADG* EERGSRGSGTTVPG 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6793 


2340 


805 


grkeany \ ygs1/tqagt vslgldaegqevfvp fsavlpmvapnd 
lvfdgwdi s s lnlxaeamrrakviidwglqeqlwphmealrprps v 
yipefiaanqsaradnlipgsraqqleqirrdirdfrssagldk 
viviiwtanterfcevipglndtaenllrtielgi^evspstlfav 
as i legcaflngspqntlvpgalelawqhrvfvggddfksgqtk 

VKS VLVDFL I GSGI*KTMS I VSYNHLGNNDGENLSAPLQ FRS KEV 
SKSNWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRALiDE 
YTS ELMLGGTNTLVLHNTCEDSLLAAPIMLDIALIjTELCQRVS F 
CTD^PEPQTFHPVLSLLSFIjFKAPIiVPPGSPVVNAIjFRQRSCI 
EN I LRACVGLPPQNHMLLEHKMERPGPSLKRVGP VAATY PMIiNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRLFLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1349 


DDVXRKPEASAH * EKPGP PSR PG VRGGRERAGGRGSHGARS CR \ 
EPAPPAPAPPEDHPDEEMGFTIDI KS FLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRLFERYGEPSEVFINRDRGFGFIRLESRTLAE 
IAKAELDGTI LKSRPLR I RFATHGAALTVKNLS P WSNELLEQA 
FSQFGPVEKAVWVDDRGRATGKGFVEFAAKPPARKALERCGDG 
AFLLTTTPRP V I VEPM EQ FDD EDGIiPE KI^MQKTQQ YHKE REQPP 
RFAQPGTFEFEYASRWKALDEMEKQQREQVDRNIREAKEKXtEAE 
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR 
HEEEHRRREEEM I RHR EQEELRRQQEGFKPN YMEW YVCHFIiR 


6795 


1740 


1010 


G PRRQTQ VRD1 IELDS F * D WAAQETDCAQNSG E RIj * KG V / XjEN FS 
TMSKSAVKISLDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNQ 
EKVNQIQKTVI E PLKKFGS VFPSLNMAVKRREQALQDYRRLQAK 
VEKYEEKEKTGPVLtAKLHQAREELRPVREDFEAKNRQLLEEMPR 
FYGSRIjD Y FQPS FES 1*1 RAQVVYYS EMHKI FGDLS HQLDQ PGHS 
DEQRERENEAKLSEIjRALS I vadd 


6796 


48 


683 


GKEIQI PTIKLAWLLFGLE* PVGALGKGWSF+ * SHVALGQLGW 
LTRAVRSSWRWELCVSAQEWSQRSA+SSPSPVGACPSIjNPPET 
S VQEGRDCWQR* L PRLFSAIjVGQPG cw PQGAP PERCV * PGRCKW 
HIjQSQVLR* ERRRCCRCI*PRFA*GWRRRHQRI*GLG IHPAPLGST 
SPPH PEGNSQQCRR * GWAAELiRLPSS WL * GKLG C * 


6797 


1620 


211 


TERMTPSQPTRGSSCTRPSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLI^ASGTTTEATWTRPTTHLTLIRWWLI*TASRVDPP 
ERPPPPPSDDLTL)UESSSSYKNIj/DAQIPQ/DWSMSPSTSG*RP 
LTSRASS IMRSRTAI PSAS *SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTETTASGSCIiTWWSSSPAPCPSSSAPAHSFEASCCK 
TSLWGS CGGSGDGSSACGSGWNI»SMAGTSCSSPAMCS PSRAPS * 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 
PHCGWPCPASCAS AAAWLSSTWATAS VAGSCWGP IM* SS AHSPW 
CLSACSRSSMGTTCL+RSPPX SGASRAAAAWCGSS PSSTFTPSS 
ASSSTWCS ASSSRSSPAPTTPSS I PAAQAQRRAS CRPTSHSART 
APP PAS S AAGAAR PAAFSAAAEGTPRRS I RCW 


6798 


3894 


1696 


S TI S WE S LES WLNKATNPSNRQED WE Y 1 1 G FCDQ1 NKELEG * VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVIiIiVPQXPQIA 
VRLLAHKI QS PQEWEALQALTYLGDRVS E KVKTKV I ELLYSWTM 
ALPEEAKI KDAYHMLKRQGI VQSDPPI PVDRTLI PS PPPRPKNP 
VFDDEEKS KLLAKLLKSKNPDDLQEANKL I KSMVREDEAR3 QKV 
TKRIJKTLEEVNNNVRI*I*SEMLIjHYSQEDSSDGDREI>MKEI*FDQC 
ENKRRTLFKLASETEDNDNSLGDI LQASDNLSRVINS YKTI I EG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAEIiDTTWSIiSSVIiA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAI. 
SWLDEEIiLCLGLADPAPNVPPKESAGNSQWHIjLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPAIAPKVEPAVPGHHGLALGNSALHHLDAI. 
DQLLEEAKVTSGLVKPTTS pli ptttparpllpfstgpgs PLFQ 
SEQ
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PliSFQSQGSPPKGPEIiSLiASIHVPLESIKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQP PSGTELS P FS PIQPPAAI TQVMLLANPLKE KVRLR YK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6799 


3894 


1696 


STISWES1*ESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
AliWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVIiLVPQ\PQIA 
VRIjLiAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELIjYSWTP'I 
ALPEEAKI KDAYHMLKRQGIVQSDPP I PVDRTL1PSPPPRPKNP 

VFDDEEKS kllakllksknpddlqeankli ksmvredeariqkv 

TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRV1NS YKTI I EG 
OVINGEVATLTLPDSEGNSOCSNQGTIiIDlAELDTTNSLSSVItA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAli 
SWLDEELIiCLGLADPAPNVPPKESAGNSQWHLLQREQSDIiDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVEPAVPGHHGIiALGNSAI.HHLDAl, 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLS FQSQG S PP KG PE LSLAS I HVPLES I KPS S ALP VTAYDKNG F 
RILFHFAKECPPGRPDVLWWSMLNTAPIiPVKSIVLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLliANPIiKEKVRLRYK 

I/FFALGE qlstevge vdqfpp veqwgnl 


6800 


404 


1646 


RRSPSTGLS PVPQPS SPSLSDYSI P WSLLLSGTIAWATPGK* AG 
* P QAW * LGLAPAI AF I /GLTRGRKQN KEKMAEGGSGD VDDAGD C 
SGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARA 
RATRARRAVQKRASPNSDDTVI»SPQEX«QKVLC1»VEMSEKPYILE 
AAI*I ALGNKAAYAFNRD I IRDLGGL P IVAKI LNTRDP I VKEKAL 
IVKMNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLIi 
TNMTVTNE YQHMLANS I SDFFRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQVPS SLG\ SLFNKKENKEVI LKLLVI FENINDN 
FKWEENEPTQNOFGEGSIiFFFIiKEFQVCADKVLGIESHHDFIjVK 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESQQASVTMHDVDAESFEVliVDYCYTGRVSLSEANVERI. 
YAAS DMLQLE YVREACAS FLARRLDLTNCTA I LKFAD AFGHRKL 
RSQAQS Y I AQNFKQLSHMGS I REETLADLTLAQLLAVLRLDSLD 
VE SEO/IVClIVAVQWIiEAAPKERG PS AAEVF KCVRWMH FTEEDQD 
YXEGIjLTKP I VKKYCLDVIEGALQMRYGDLIi YKSLVP VPNSS S S 
/r* qcxjlsci CSRKSTPETGYVCQGDGDLLWTPQRSLS \RYDPY 
SGDI YTMPS PLTSFAHTKTVTSSAVCVS PDHDI YLAAQPRKDLW 
VYKPAQNSWQQLADRLLCREGMDVAYIiNGYI YILGGRDPITGVK 
LKEVECYSVQRNQWALVAPVPHSFYSFELIWQNYLYAVNSKRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDE I YCICDI PVMKVYN 
PARGEWRR I SN I PLD S ETHNYQ I VNHDQKLLL I TSTTPQW KKNR 
VTVYEYDTREDQWINIGT^U J GLIiQFDSGFICLCARVYPSCLEPG 
QS FI TE EDDARS ES S TE WDLDG FS E LiDb Ki><3 £> £» 5*5 r t» DDbVWVQ 
VAPQRNAQDQQGSL 


6802


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVOGTSRMIAAESSTEHKECAE 
PSTRKN LMNSI»EQKIRCLEKQRKELLEVNQQWDQQFRS MKELYE 
RKVAEXKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 
RDRIX3REEKEKERLNEEI^ELKEEKKIjLKGKNTLANKEKEHYEC 
E I KRLNKAIiQDAIiNI KCS FSEDCLRKSRVEFCHEEMRTEMEVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLNRLNSQ 
I KACQMEKE KLEKQLKQMYCPPCN CGLVFHLQDPWVPTGPGAVQ 
KQREHPPD YQW YALDQLPPDVQHKAN/ DWCUVPPPVCCQAG/ PR 
TPGLK*SSCLWLPKC*NFRFILSKESPSVEVHTNRERQQATRER 
G 


6803 


1 


2203 


KLSGRPYRHMGVLGTSKLYDIRKTIFTFTPQFIDQQQFYLAJbDN 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








KM I VEMLRTDLS Y LCS RWRMTGQ PT I TFP I SHS MLDE DGTSLNS 
S I tiAAJbRKMQDG YFGGAR VQTG KLS E FLTTS CCTHLS FMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKIAPTSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYLPTKLFQASR PS FNLLDS PHPRQENQVPS VRVE IHLPRD 
QSGEVDFKAIjVLQLKETSSLQEQADILYMLYTMKGPDWNTELYN 
ERSATVREIiLTELYGKVGEIRHWGLIRYlSGILRKKVEALDEAC 
TDLLSHQKHLTVGLPPEPREKT1SAPLPYEALTQLIDEASEGDM 
SISII.TQEIMVYLAMYMRTQPGLFAEMFRLRIGLIIQVMATELA 
HS LR CS AEEATEG LMNLS PSAM KNLI*HH I I*S G KEFGVE R K/ S VR 
PTDSNVS PAI S IHE IGAVGATKTERTG IMQLKSE IKQVE FRRLS 
I SAESQS PGTSMTPSSGS FPSAYDQQSSKDSRQGCWQRRRRLDG 
ALNRVPVGFYQKVWKVLQKCHGliSVEGFVLPSSTTREMTPGEIK 
FSVHVES\VLNVI>UlPEYRQI*IiVEAILVLTMIiADIEIHS IGSI I 
AVEKI VHlANDliFIiQEQKTLGP \DDTMI±AKDPASG\ I CTLR\ YD 
SAPSGRFGTMTYLS\RAA\ATY VQEFLP\HS I CAMQ 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GI*E EKRKSLRTTGF YSGFS EVAE KR I KLLNNS D ERLQNS RAKDR 
KDVWSS IQGQWPKKTLKELFSDSDTEAAAS P PHPAPEEGVAEES 
LQT VAE EE S CS PS VELE KPPP VNVDS KP I E EKTVE VNDRKAE FP 
SSGSNFSA* IPLPYLHLNRIiHQSIi* QKGSRQQSSVTVSEPLAPN 
QEEVRSIKSETDSTIEVDSVAGEIiQDLQSERE* LASRF* CQCKL 
KQ * * SARTRTS * KSL YRSEKSERCSGRRKF1 KKAEKKP * SNSGK 
QQKEGKRHK 


6805 


1539 


206 


RQPDLKYFGKS FDVS VS ESSSLLSNDLPKFADG I KARNRNQNYL 
VPS P VLR I LDHTAFSTE KSADI VI CDEECDS PES VNQQTQEESP 
I E VHTAEDV P I AVE VHAI SEDYDI ETENNSSESLQDQTDEEPPA 
KLCKI LDKS QA2jNVTAQQKWPLI>RANS SGL Y KCEDCE FNS KYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI 
CKYCDYKTV I FENLSQHIADTHFSDHLiYWCEQCDVQFSSSSEI»Y 
LHFQEHSCDEQ YLCQFCEHETND PEDIjHSHWNEHACKL lELSD 
K YNNG EHGQ Y S LLS KI T FDKCKN F FVCQ VCG FRS R LHTNVNRHV 
AI EHTKI FPHVCDDCG KG FS SMLE\ I AKHLN S HL*S EG I YLiCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERELISHLP 
VHETT 


6806 


272 


3794 


VALCF PNSDP VMFMDAF YGCLaLAEIjG P VP I EVPI/TR KDAGSQQV 
GFliLGS CGVFLAIjTTDACQKGLPKAQTGEVAAFKG WP PLS WJbVI 
DGKHLAKPPKDWHPLAQDTGTGTAYI e ykts KEGSTVGVTVSHA 
SI^QCRALTQACGYSEAETLTNVLDFKRDAGLWHGVLTSVMNR 
MHVVSVPYALMKANPLSWIQKVCFYKARAALVKS RDMHWSLLAQ 
RGQRDVSLSSLRMLIVADGANPWSISSCDAFLNVFQSRGIjRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLS VLTVQDVGQVMPGANVCVVKL»EGTP YL CKTDE VGE I CV 
oobA I la A A I YOliLfcrlTKNVr EAVPVTTGGAP I FDRPb J. RTGLLG 
F IGP DHLVFI VG KLDGLM VTG VRRHNADDWATALAVE PMKFVY 
RGRIAVFSVTVLHDDRIVLVAEQRPDASEEDSFQWMSRVLQAID 
S IHQVGVYCLALVPANTLPKAPLGG IH I SETKQR FLEGTLHPCN 
Vr^CPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQASGR 
ELAHIiEDSDQARKFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
STATCVQLHKRAERVAAAL.MEKGRLSVGDHVALVYPPGVDLIAA 
FYGCLYCGCVPVTVRP PHPQNLGTTLPT VKM I VEVS KSACVLTT 
QAVTRLLRSKEAAAA VDIRTWPTI IjDTDDI PKKK I AS VFRPPS P 
DVLAYLDFSVSTTGI LAG VKMSHAATS ALCRS I KLQCELYPSRQ 
IAICLDPYGGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVSQYKARVTFCCYSVMEMCTKGIiGAQTGVLRMKGVNI^CVRTC 

mwaeerp\r I altqs fs klfkdlgi* paravstt fgcrvnvai c 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








LQ^TAGPDPTTVYVDMRALRHDRVRLVERGSPHSLPLMESGKIL 
PGVKVI I AHTETKGPIjGDSHLGE I WVS S PHNATG Y YT V YGEEAL 
HADH FSARLS FGDTQT I WARTG YI»G FLRRTELTDAS GGRHDAL Y 
WGSLDETLELRGMRYHPIDIETSVIRAHRSIAECAVFTWTNLL 
WWELDGLEQDALDIiVAIiVTNWXjEEHYLWGWVI VDPGVI P 
INSRGEKQRMHLRDGFLADQLDPI YVAYNM 


6807 


1444 


606 


VGHDT VHAM FTCFPKCLG FS P P VNVTVS PRS EESHTTT VSGGNG 
.SVFQAGPQLQALANLEARRGSIGAALSSRDVSGLPVYAQSGEPR 
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGFCSPLSSGGGAE 
S LP PGGPGHAEAGHLG KVCDFHLNHQQPS PTS VLPTE VAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 
RFPCGME VH S GQRELES WAVGE AMA\l»K F P MGAMS YCLRDRSR 
FLFRIiPMGLSCPLQVQ 


6808


2063 


737 


GVGSGAASALARSRPIjASRLSSRRRTRAPRSGAMQRLiAMDIjRML 

srelslyiiehqvrvg f fgsgvgls l i lgfsvayafy yls s iakk 
pqlvtggesfsrflqdhcpwtetyyptvwcwegrgqtllrpf\ 
its kppvqyrneliktadggqi sldwfdndnstcymdastrpti 
lllpgltgtskesyilhmihlseelgyrcwfnnrgvagenllt 
prtyccantedletviiiilvi-isiiypsapfiiaagvsmggmliilnyl 

GKIGSKTPUOAAATFSVGWNTFACSESLEKPIjNWLLFI^YYLTTC 
IfQSSVNKHRHMFVKQVDMDHVMKAKS IREFDKRFTSVMFGYQTI 
DDYYTDASPS PRLKSVG I PVLCLNS VDDVFS PSHAI PIETAKQN 
PNVALVLTS YGGH IG FLEG I WPRQS T YMDR VFKQFVQAM VEHGH 
ELS 


6809 


939 


65 


DYSGQTPVpTEHGNTrLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQP LHPSDPTE KQQ PKRLiHVSNI PFR FRD PDLRQMF 
GQFGKI LDVE 1 1 FNERGSKGFGFVTFETSSDADRAREKLNGTI V 
EGRKIEVNNATARVMTNKKTGN P YTNGWKLNP WGAVYG PEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPP I PTYG 
AWYQDGFYGAE I \LEATQ PTDTL S PLQRRQPTATVTAESTQLP 
TRTI TPSG PRRPTALE PCETFHRFLLGP 


6810


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQ P KRIjHVSNI P FRFRDPDLRQMF 
GQFGKI LDVE 1 1 FNERGS KGFGF VT FETSSDADRAREKLNGTI V 
EGRKIEVmATARVMTNKKTGNPYTNGWKLNPVVGAVYGPEFYA 
VTGFP YPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPI PTYG 
AWYQDGFYGAE I \LEATQPTDTLS PLQRRQPTATVTAES TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1522 


658 


DLVTVWSFVDCRVIASTHGH\KSWVSWAFDPYTTSVEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 
SVGQDTQLCLWDLTED I LFPHQPLSRARTHTNVMNATSP PAGSN 
GNS VTTPGNS VP P PLPRSNS LPHS AVSNAGSKS SVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDPAKTIiGT PLCPRMEDVPLLEPL I C KK I AHERLT VL I FLEDCI 
VTACQEGFICTWGRPGKWSFNP 


6812 


4001 


1682 


EDA VFS LDLSTI I QGTWFLNGEELKSNEPEGQVE PGALR YR IEQ 
KGLQHRLIL1 IAVKHQDSGALVG FS CPGVQDSAALTI QES P VHI L 
SPQDKVSLTFTTSERVVLTCELSRVDFPATWYKDGQKVEESELL 
WKMDGRKHRL I LPEAKVQDSGEFECRTEGVSAFFGVTVQDP PV 
HI VDPREHVFVHAI TS ECVMLACE V \ DR\EDAP VRW YKDGQEVE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVS S W I VY PSG KVYVAAVRLER WLT CELCRP V? AEVRWT KDGE 
EWES PALLI^KEDTVRRLVLPAVQLEDSGEYLCE IDDESAS FT 
VTVTEPPVRI I YPRDEVTLI AVTLECWLMCELSREDAPVRWYK 
DGLEVEESEALVLERDGPRCRLVLPAAQPEDGGEFVCDAGDDSA 
FFTVTVTEPPVQFIiALETTP S PLCVAPGE PVVLS CELSRAGAPV 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








VWSHNGRPVQEGEGLELHAEGPRRVIjCIQAAGPAHAGLYTCQSG 
AAPGAPSLS FTVQVAEP P VRWAP EAAQTRVRS TPGGDLE L WH 
LSGPGGP VR W YKDGERI»AS QGR VQLEQAGAR Q VLR VQGARSGDA 
GEYLCDAPQDSRIFX,VSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEVSPPDADVTWLRNGAWTPGPQRQSCCSYGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL 
LHSRQGSQIDQTECV3RMNDAPTRGYGRDVGNRTSLRVIAHSSI 
QRILRWRHDLLNVSQGTVFIFWGPSSYMRRDGKGQVYIWLHLIjS 
QVLPRLKAFMITRHKMLQFDEIiFKQETGQ\NRKISNTWLSTGWF 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814


3 


737


KFRRQEAN/ARERNRMHGLNDALDNLRKWPCYS KTQKLSKI ET 
LRLAKNYIWALSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRSP YSTFYPPYHS PELTTPPGHG 
TLDNSKSMKP YN YCSAYES FYESTS PECAS PQFEGPI.S PPPIN Y 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPLGQGAMFRLPTD 
SH FP YDU ILRSQSLTMQDEIaNAVFHN 


6815 


906 


553 


QGLDPASQTKWEIiLKDGSGRRGDRRSSRDMAGGAGPRSESDIjE 
DVGPTAEWNGDGSGSLRRS GS FGKLRDALRRS S EMI*VK KIjQGGT 
PQEP PNPRMKRASSIjNFIjNKS VEE PTQPGG 


6816 


1 


803 


NLLKTHKF \ LLGQDEDSLHS VPVAQMGNYQE YI*KTLiAS PUtE I D 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSSKRRRSMSLLLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LI KPTLVHTDATI IHDGHEEKMENGQI TPDGFLSKSAPSELINM 
TGDLMPPNQVDSIiSDDFTSLSKDGIil QKPGSNAFVGGAKNCS LS 
VDDQKD PVASTLGAMPNTLQ I TPAMAQGINAD I KHQLMKEVRKF 
GRSK 


6817 


172 


3457 


I^M^4DSPKIGNGI J PVIGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DE YCPACKEKG KLKALKTYRI S FQES I FLCEDLQCI YPLGS KSL 
NNLI S PDLEECHTPHKPQKRKSLESS YKDSLLLANSKKTRNYIA 
I DGGKVLNS KHNGEVYDETSS WLPDS SGQQNP I RTADSLERNE I 
LEADTVDMATTKDPATVDVS GTGRPS PQNEG CTSKLEMPLESKC 
TSFPQALCVQWKNAYALrCWIiDCI LSAI»VHSEEbKNTVTGI»CS KE 
ESI FWRLLTKYNQANTLIiYTS QLSGVKDGDCK KLTS E I FAE I ET 
CLNEVRDE I FISLQPQLRCTLGDMES PVFAFPLLLKLETH I EKI» 
FLYSFSWDFECSQCGHQYQNRHMKSLVTFTNVI PEWHPLNAAHF 
GPCNNCNSKSQI RKMVLEKVS P I FMLHFVEGLPQNDLQRYAFHF 
EGCLYQITSVIQYRANNHFITWILDADGSWLECDDLKGPCSERH 
KKFEVPASEIHIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 
KP VSLTS CS VGDAASAETAS VTHPKD I SVAPRTLSQDTAVTHGD 

AENTGI LKTNTLLSQES LMAS S VS A P CNEKL I QDQFVD I S F PSQ 
WNTNMQS VQLNTEDTVNTKS VNNTDATGLIQGVKSVEI EKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKG INQKASHVSKKARKSASKPPPI SKPPAGPP 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQDNHSSYGNGISSANH 
EDLVEGQIHKI^LKLRKKLKAEKKKLAALMSS PQSRTVRSENLE 
QVPQDGS PNDCES IEDLLNELP YP I DIANESACTTVPGVSI* YSS 
QTHEEILAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
EFNDVSQNTHLRQDHNYCS PTKKNPCEVQPDSLTNNACVRTLNL 
ESPMKTDI FDEFFSSSALNALANDTIiDLPHFDEYLFENY 


6818 


2 


240 


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELiKVWVTYTIi 
DGGE/ LHS / ATTEHKP/ VQATPVNLT\TI LTSTWQARItPQI 


6819 


1 


961 


GI PCTEMGNFDNANVTGEI EFAI HYCFKTHSLE ICI KACKNLAY 
GEEKKKKCNPYVKTYLLPDRSSOGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQI^VSVWHLGTIARRVFLGEVIIPLATWDFEDS 
TTQSFHWHPI^AKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTDQPSLHGQLCLWLGAKNLPVRPDGTLNSFVKGCLTLP 
DQQ KLRLKS PVLR KQACPQWKHS FVFSGVTPAQIiRQS S LELTVW 
DQALFGMNDRLLGGT \ RLGSKGDTAVGGDACSQSKLQWQKVLS S 
PNLWTDMTLVLH 


6820 


1014 


340 


GDMVY IVGHVP PGFFE KTQNKAWFREGFNEKYLKWRKHHRVI A 
GQFFGHHHTDS FRMLYDDAGVP I SAMFITPGVTPWKTTLPGVYN 
GANN PA I R VFE YDRATLS LKDMVTYFMNLS QANAQGT PR WELE Y 
QLTEAYGVPDASAHSMHTVLDRIAGDQSTLQRYYVYNSVSYSAG 
VCDEACSMQHVCAMRQVDI DAYTTCLYASGTTPVPQL PLLLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLI EG Y I \S IVMDAETQKKFPSDLLLTS SSGELWRMVRIG 
GQ P LG FDECG I VAQ I AGPLAAAD I SAYY I STFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6822 


1083 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLP AIATTL I DVLF YSHSTPKEAASSS PE PSS I T 
FFAFSLI EG Y I \SIVMDAETQKKFPSDLLL»TSSSGELWRMVRIG 
GQPIiG FDECG I VAQI AGPLAAADI SAYYISTFNFDHALVPEDGI 
GSVI EVLQRRQEGLAS 


6823 


654 


221 


PPKLLSRWARMGHGDBIVVLSDLNFPGLLHLPWGPWRSVQTAC 
GIPQLLEAVLKIiliPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YES ILRRAGCVRAIAKIERFEFYERAKKAFAWATGETAIiYGNL 
IIiRKGVLALNPKL 


6824 


858 


104 


LLLAQRWGWG \ CCFFSLAVS VKMNVLLFAPGLLFLLLTQFGFRG 
ALPKI^ICAGLQWLGLPFLLENPSGYLSRS FDLGRQFLFHWTV 
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGES I LS 
LLRDPS KRKVPPQPLTPNQIVSTLFTSNFIGI CFSRSLHYQFYV 
WYFHTLPYLLWAMPARWLTHLLRLLVLGLIELSWNTYPSTSCSS 
AALH I CHAVILLQIiWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILIILCSLMEPWAU3ACTFVHLL 
PKFDPLVILKTLS S YP I KSMMG AP I VYRMLLQQDLS S YKF PHLQ 
NCLAGGESLLPETLENWRAQTGLD I RE F YGQTETGLTCMVSKTM 
KI KPG YMGTAAS C YDVQI IDDKGNVL PPGTEGD IG IRVKP IRP I 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDEDGYFQFMGRADD 
I INSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGE WKAF 
VIIiALQFLSHDPEQLTKELQQHVKSVTAPYKYPRKI EFVLNLPK 

P FGPLALPMDG YGDSLWE EHEY KFCIiALVI STKLYHVRC 


6826 


2304 


954 


LKTES F KPW/ VN I ALA FHLLG ERAS PNS FWQP Y I QTLPRE YDTP 
L YFEBDE VRYLQS TQAIHDVFSQYKNTARQ YAYFYKVT QTHPHA 
NKLPLKDS FTYEDTOWAVSSVMTRQNQI PTEDGSRVTLALI PLW 
DMCNHTNGL I TTG YNLEDDR CE CVALQD FRAG EQ I Y I FYGTRS N 
AEFVIHSGFFFDNNSHDRVKI KLG VS KS DRL YAM KAEVLARAG I 
PTSS^FALHFTEPPISAQLIAFIjRVFCMTEEELKEHLLGDSAID 

VLKNHDLSVRAKMAI KLRLGEKEILEKAVKSAAVNREYYRQQME 
EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVQDALNI 
REA I S KAKATENGLVNGENS I PNGTRSENESLNQES KRAVEDAK 
GSSSDS TAGVKE 


6827 


1 


779 


SSWEFGLSVLGGLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL 
ETRNLD P ENGSGMAliQ PLQAAPEPGAQGQRE KNSQHP P ALA P PG 
HQGHSHGHQGGTD I TWMVLI<GDGLHNLTDGLA IGAAFS DGFS SG 
LSTTLAVFCHELPHEIiGDFAMLLQSGLSFRRLIXLSLVSGALGL 
GGAVLG VGLS LG P V P LT PWVFG VTAGVFL YVALVDML PALFPS S 
GAPAYA\HVLLQGLGLLLGGCLMLAITLLEERI,LPVTTEG 


6828 


3 


1654 


KSQHG / WI LQLMHSCKEGYVKDLKGNPGLHRAMLDLDNGTRPSE 
I*GHI*SQTAS LKRGS S FQSGRDDTWR YKTPHRVAFVE KliTKLVLS 
QLPNFWKLW I S YVNGSLFSETAEKSGQI ERS KNVRQRQNDFKKM 
IQEVMHSLVKLTRGALIiPLS IRDGEAKQYGGWEVKCELSGQWLA 
HAIQTVRLTHE SLTALE I PNDLLO/TIQDLILDLRVRCVWATLQH 
TAEEIKRLAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE 
CKPGEAS VFQQPKTQEE VCQLS IMI MQVFI YCLEQLSTKPDAD1 
DTTHLSVDVSS PDLFGS IHEDFSLTSEQRLL I VLSNCCYLERHT 
FLN IAEHFEKHNFQG I EKITQVSMASLKELDQRLFEN Y I ELKAD 
P I VGSLE PG 1 YAG Y FDW KDCLP PTGVRN YLKEAI/VN 1 1 AVHAE V 
FTI S KELVPRVLS KVI EAVSEELSR LMQCVSS FSKNGALQARLK 
I CALRDTVAVYLTPES KS SFKQALEALPQLS SGADKKL.L EE LL.N 
: KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782 


MRM EAGEAAP PAGAGGRAAGGWG KW VRLN VGGT VFLTTRQTLCR 
EQKS FLS RL CQGEELQSDRDETGAY L> IDRDPT YFGP I IiNFLRHG 
KLVLDKDMAEEGVLEEAEFYNIGPLIRI IKDRMEEKDYTVTQVP 
PKHVYRVLQ CQEEELTQMVSTMSDGWRFEQLVN I GS S YN YGSED 
QAEFZjCVVSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 
EEVEVEQVQVEADAQEK/CCYKPEAPGCEAPDHLQGIiGVPI 


6830 


1 


939 


MEPGS VENLS I VYRSRDFLWNKHWDVR IDSKAWRETLTLQKQI* 
RYRFPELADPDTCYGFRFCHQLDFSTSGALCVALNKAAAGSAYR 
CFKERR VTKAYIALLRGHIQESRVT I SHAIGRNSTEGRAHTMCI 
EGSQGCEN P KPS LTDLVVLEHGLYAGDP VSKVLLKPLTGRTHQL 
RV\HCSAI/5HPWGDLTYGEVSGREDRPFRMMLHAFYLRIPTDT 
ECVEVCTPDPFDPSLDACWSPHTLIiQSLDQLVQALRATPDPDPE 
DRGPRPGSPSALLPGPGRPPPPPTKPPETEAQRGPCLQWliSEWT 
LEPDS 1 


6831 


3 


1087 


SLF FGS STPDW KVAEQEDI*E TQP S PS VEKAVT V ID PEGT I P TNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKFI SEVSREDYGKKEI SGDSEEMNINS WTSADGENL 
EIQSYSLIGBKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
SIFKEEPRSDQKQKSLLSFDWDKVPQQPKSASSNFASKNITKE 
SEKPES I ILPVEESKGSLIDFSEDRLKKEMQNPTSIiKISEEETK 
LRSVS PTEKKDNIjENR \ S YTL»\ AE KKVLAE KQNS V\ APLE LRDS 
NEIGKTQITLGSRSTELKESKADAMPQHFYQNEDYNERPKIIVG 
SEKEKDEKKKK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
v»jLiJ\A.JUtoriLUJ i t>lr 1 X 1 y QrPKRbNIiIjRGQQEEEERLLKAIPliF 
CFPDGNEWASLTEYPRETFSFVLTNVDGSRKIGYCRRIjLPAGPG 
PRLPKVY CI 1 SCIGCFGLFSKILDEVEKRHQISMAVT YPFMQGL 
REAAFPAPGKTVTLKSFI PDSGTEFI SLTRPLDSHLEHVDFSSL 
LHCLSFEQILQI FASAVLERXIIFLAEGLSTLSQCIHAAAALLY 
PFSMAHTYIPWPESIXATVCCPTPFMVGVQMRFQQEVMDSPME 
EVLLVNLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINEIiKT 
AEQ INEHVSGPFVQFFVKI VGHYAS YI KREANGQGHFQERSFCK 
ALTSKTNRRFVKKFVKTQLFSbFIQEAEKSKNPPAGYFQQKIIiE 
YEEQKKQ/TETKGKNCE I RAWNKND 


6833 


1 


1129 


PLMTDSQCGG I PGHGHSHGGHGHGHGLPKG PR VKSTRPGS S DIN 
VAPGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVREPDHMELEEDRAGQU^MRGVFMVTjGDALGSVIVVV 
NALVFYFSWKGCSEGDFCVNPCFPDPCKAFVEIINSTHASVYEA 
GPCWVLYLDPTLCVVMVCILLYTTYPLLKESALILLQTVPKQID 
IRNLIKELRNVEGVEEVHELHVWQIAGSRI IATAHIKCEDPTSY 
MEVAKTI KDVFHNHGIHATT IQPE FASVGSKSS WPCELACRTQ 
CAbKQCCGTLPQAPSGKDAE KT PAVS I S CLELSNNLE KKPRRTK. 
AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRLLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGSSASAYGWH* RbTPWSPGGS * HM * SS KAP VTQARE VLVAG P 
CSKLVLSGARGIVGTTVQVLVEAQQPLLLLFTGVWGLNIiRAGEE 
SRAL* LI EE VTQVRDAHLGNAWGCAQCLSQG QVGSALAKALLE 
AAAAVRDCKE VLTVSGDKQQAEVS VRL * VRDVCVEEAGCVEFGQ 
AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRbLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 
LQQ WGDAL + ARE * APQ 1 1 VLLLLEDVAQLRTGKKA* DLWDVE 
QLLRQL 


6835 


1 


834 


G I PAADR \ EASLEL I KLDISRTFPNLCI FQQGGP YHDMLHS IbG 
AYTC YRPDVG YVQGMS F I AAVb I LNLDTADAF I AFSNLLNKPCQ 
MAFFRVDHGLMLTYFAAFE VFFEENLPKLFAH FKKNNLT PD I YL 
I DW I FTL YSKSLPLDLACRI WDVFCRDGEEFbFRTALGI LKbFE 
DIbTKMDFIHMAQFLTRLPEDLPAEEbFASIATIQMQSRNKKWA 
QVbTAbQKDSREMREGKSVPPTIiRJbQREFAbGTNQSPMPRPLCC 
FRLTPGQPRRTDAb 


6836 


1 


850 


MSCGRPPPDVDGMITLKV\DNLTYRTSPDSLRRVFEKYGRVGDV 
YIPRE^HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGREbRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 
RRHRSRSRGPS CSRSRS RSRYRGS R YSRS P YSRSPYSRSR YSRS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 
KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


TOG AAVAGNPGSDY FPGGTAP /GG PRTRRP \ SGTSS SGS KASGP 
PNPPAQGDGTSbSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS PGQQQASGAAVGGSSAGET 
RG APTPHEKALTS PS WGKGAELLLGDQ PDL I GS LDGGAKSDS S S 
PNVGEFASDEVS TS YANEDEVSS S S DNPQALVKASRS PbVTGSP 
KLPPRGVGAGEHG PKAPP PALGbG I MSNSTSTPDS YGGGGGPGH 
PGTPGIiEQVRTPTSSSGAPPPDE IHPbEILQAQ IQLQRQQFS IS 
EDQPbGLKGGKKGECAVGASGAQNGDS ELGS CCS EAVKS AMSTI 
DLDS LMAEHSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDI SNRFGTFVAALT 


6838 


16 


499 


bTDTP PP KTHM I HHS I S DYKATLRC WALGFYPME I TLTW QQDEE 

dqtrdmelvetrpagdgtfqkwaawvpsgee/q/rymchvqhe 
glpepltbrweqssqptipivgivagbvllgawtgawsavmc 

RKKNSDRVS Y ^ EAA5J ^nHAOf? QDV<?T .TAPW 


6839 


1 


1195 


AAPAGGG PDPEALS AFPGRHLSGbS W PQ VKRbDALLSEP IPIHG 
RGNFPTbS VQPRQIRAGGPQHPGGAG \ IHVHRVRLHGSAASHVL 
HPESGLGYKDLDbVFRMDLRSEAS FQb'TKAWbACLbDFbPAGV 
S RAK I T PbTLKEAYVQKbVKVCTDSDRWSLI SbSNKSGKNVELK 
FVDSVRRQFEFS IDS FQI I LDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHbRHRVIATRS PEE I RGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AARRYACLVTbHRWNESTVCLMNHERRQTLDb I AALALQALAE 
QGPAATAAbAWRP PGTDGVVPATVNY YVTP VQPLI*AHAYPTWLP 
CN 


6840 


4254 


2061 


EbQGDFSVPDVPKSMAWCENSICVGFKRDYYLIRVDGKGSIKEL 
FPTGKQLEPLVAPLADGKVAVGQDDbTVVLNEEG I CTQKCAIjNW 
6841 









3206 



6842 



6843 



6844 



926 



851 



TXJIPVAMEHQPPYIIAVLPRYVEIRTF EPRLLVQSIELQRPRFI 



"642- 



— - ***o»ucm vcij^xriif icijjjvyii-LliLiyKPRFI 

TSGGSN1 1 YVASNHFVWRLI PVPMATQIQQLLQDKQFELALQLA 
EMKDDSDSEKQQQIHHIKNLYAFNLFCOKRFDESMQVFAKLGTD 
PTHVMGL YP DLI*PTD YRKQIjQ YPNPLP VLSGAE LEKAHLAL I D Y 
LTQKRSQLVKKUJDSDHQSSTS PLMEGTP TI KS KKKLLQX IDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
LYEKKGLHEKALQVLVIX3SKKANSPLKGHERTVQYLQHLGTENIi 
HLI FS YSVWVLRDFPEDGLKI FTEDLPE VES LPRDRVLGFL I EN 
FKGIAI PYLEHI IHW7EETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
P FDGLLE ERALLLGRMG KHEQALF 1 YVHI LKDTRMABE YCHKH Y 
DRNKDGNKD VYLSLLRM YLS PPS IHCIX3P I KLELI*E PKANLQAA 
LQVLELHHS KljDTTKAI»Nt*LPANTQIND I RI FIiEKVLEENAQKK 
RFNQVLKNLLHAEFIoRVVQEERILHQQVKCI ITEEKVCMVCKKK 
IGNSAFARYPNGWVHYFCS\KEVNPADT 

TPSTTGTKSNTPTSSVPSAAVTPLNESLQPLGDYGVGSKNSKRA 

REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 

MCPETRLDRTGSSPTQGIVNKAFG1NTDSLYHELSTAGSEVTGD 

VDEGADrXGEFSGMGKEVGNLLLENSQLLBTKNAIJIVVKNDLlA 

KVDQLSGEQEVLRGEIiEAAKQAKVKLENRIKELEEELKRVKSEA 

IIARREPKEEAEDVSSYLCTESDKIPMAORRRFTRVEMARVXiME 

RNQYKERIjMeIiQEAVRWTEMI RASREHPSVQE KKKST I WQ FFS R 

LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 

SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 

SL PA KYKQLS PNGGQEDTRMKNVPVPVYCRPLVEKDPTMKLWCA 

AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 

PEKKKAKELPEMDATSSRVWILTSTIiTTSKVVI I DANQPGTWD 

QFTVCNAHVLCISSI PAASDSDYPPGEMFLDSDVNPEDPGADGV 

LAG 1 TLVGCATR CNVPRSNCSSRGDTPVU)KGOGEVATI ANGKV 

NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 

PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 

AQNGWDYVHSAVANWKKCLHSIKLKDSVLSLVHVKGRVLVALAD 

GTLAI FHRGEDGQWDLSN YHIjMDLGH PHHS I RCMAWYDR VWCG 

YKNKVHVTQ PKTMQ I EKS FDAHPRRESQVRQIiAW I GDGVWVS IR 

LDSTLRLYHAHTHQHLQDVDIEPYVSKMLGTGKLGFSFVRITAL 

LVAGSRI*WVGTGNG WIS I PIjTETVYLHRGQ\ LLG \ LRANKTS P 

TSGEG\ARPGG\IIHVYG\DDSSDRAARSFIPYCSMAQAQI*CFH 

GHRDAVKFFVSVPGNVIATliNGSVLDS PAEGPGPAAPASEVEGQ 

KLRNVLVLSGGEGYIDFRIGDGEDDETEEGAGDMSQVKPVLSKA 
ERSHI I VWQVSYTPE 

KCUQIiSATILTDHQYXERTP LCAILKQKAPQXjYRIRAKLRSYKP ~ 
RRLFQS VKLHCPKCHLLQEVPHEGDLDI IFQDGATKTPDVKLQN 
TS LYDS KI WTTKNQKGRKVAVHFVKNNGI LPLSNE CLLL I EGGT 
LSEICKLSNKFNSVIPVRSGHEDLELLDLSAPFLIQGTVHHYGC 
KQWST * RS IQNLNSLVDKTS WIPSSVAEALGI VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQIPASEVLMDDDLQKSVDMIMDMFC 
P PGI KI DAYPWLECFIKS YNVTNGTDNQI CYQ I FDT TVAEDVI 

MHRKVLSGAKRYECNECGKSFAYTSSL IKHRRIHTGERPYECSE 
CGRSFAENSSLIKHLRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSh I IHIjR VHTGERP YECSDCGKS FAENSSL I KHLR VHTG E 
RP YECIDCGKS FRHS SS FRRHQRVHTGMRP YK* SKFWKFS CPG F 
LLbCGQRVHTGSRCYECDKWG I FFS*NASFFT* KSAPTEEVPFE 
CNECEKAFSPLSLVTTIFT 



EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKSE I YS FG I VLWEI ATGDI PFQGCNS E KI RKLVAVKRQQE 
PLGEDCPSELREIIDECRAHDPSVRPSVDEILKKIjSTFSK*CIK 
1 


6845 


3 


1519 


VAVRDECYWRHVFWDQDLWMLLFILMCHPETARARLEYRIRTLD 
GALENAQNLGYQGAKFAWESADSGLEVCPEDIYGVQEVHVNGAV 
GLAFELYYHTTQDLQLFREAGGWDWRAVAEFWCSRVEWSPREE 
K YHIiRG VMS PDE YHSG VNNS VYTWVL VQNS LRFAAAIiAQDLGL P 
I PSQWLAVADK I KVP FDVEQNFH P E FDGYE PGEWKQAD WLLG 
YPVPFSLS PDVRRKNLEI YEAVTS PQGPAMTWSMFAVGWMELKD 
AVRARGLIiDRS FANMAE PFKVWTENADGSGAVN FLTGMGG FLQA 
WFGCTGFRVTRAGVTFDPVCLSG I SRVS VSGI FYQGNKLNFSF 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSS S EFPGRTFSDVRDPLQS PLWVTLGSSSP 
TESLTVDPASE^SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARIiSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFIiKTIK* LNRLAEHP * YENEKLTKLRNTIMEQYTRTEESARG " 
1 1 FTKTRQSAYALSQW ITENEKFAE VGVKAHHLIGAGHS SEFKP 
MTQNEQKEVI S KFRTGKI NLI>IATTVAEEGLDI KECNIVIRYGL 
VTNE IAMVQARGRARADESTYVLVAHSGSGVI EHETVNDFREKM 
M YKAIHCVQNMKPEE YAHKI LELQMQS IMEKKMKTKRNI AKHYK 
NNPSLITFLCKNCSVIiACSGEDIHVXEKMHHVNMTPEFKELYIV 
REN KTLQKKCADYQ INGE 1 1 CKCGQ AWGTMMVHKGIjDIj P CLKI R 
NFVWFKNNSTKKQYKKWVELPITFPNLDYSECCLFSDED 


6847 


1450 


348 


SMCWNSDRLEMPL1 DLALI LYPPS Y V PYTGHLSDDSLSRKYCLT 
WFEDAIJNGVL*RAEAIQPHCVNAGDRMEKFRQKYWNK1jQTT»RQQ 
PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 
PGVVRSUMXSWEERQLALVKGLLAGNVFDWGAKAVSAVLESDP 
YFGFEEAKRKLQERPWIjVDSYSEWLQRIiKGPPHKCALIFADNSG 
IDI I LGVFP FVRELLLRGTE VlIiACNSGPALNDVTHSES lil VAE 
R I AGMDPWHS AXiREERLIjIjVQTGS S S PCLDLS RliDKGIiAALVR 
ERGADLVVIEGMGRAVHTNYHAAKRCESIjKIiAVI knawlaerlg 
grlfs vi fkyevpae 


6848 


19 


16 


amwwnsldgirnivlsnpkkrntlslamlkslqsdilhdadsnd " 

LKVI 1 1 SAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
IRNHPVPVI AMVNGLATAAGCQI.VAS cdiavasdkss fatpgvn 
VGLFCS TPG VALARAVPRKVALEMLFTGEP I S AQEALLHG LLNK 
WPEAELQEETMRIARKIASJjSRPWSIjGKATFYKQIiPQDLGTA 
YYLTS QAMVDNLALRDGQ EG I TAFLQ KRKPVWSHEPV* VEH 


6849 


70 


821 


S1 J GVDGSCLEQGSPAPRPQTDTSP*PVGNWATQQEDLYHQSYEC 
VCVLFASVPDFKEFYSESNINHEGLECLRLLNE I IADFDELLSK 
PKFSG VEKI KT I GST YMAATGLNATS GQDAQQDAERSCSHLGTM 
VEFAVAIaJSKLDVINKHSFNNFRLRVGLNH^ 
YDIWGNTVNVASRMESTGVLGKIQVTEETAWALQSIiGYTCYSRG 
VIKVKGKGQLCTYFLNTDLTRTGPPSATIiG 


6850 


2 


1235 


ARGLNHE WTFE KIjRQHI SRNAQDKQ ELHLFMI*SG V P DAVFDLTD 
LDVIjKLEIjI PEAKI PAKI SQMTNLQEI*HI*CHCPAKVEQTAFS Fb 
RDHLRCLHVKFTDVAEIPAWVYLLKNLRELYIjIGNLNSENNKMI 

gleslrelrhlkilhvksnltkvpsnitdvaphltklvihndgt 
kllvlnslkkmmnvaelelqnceleri phaifslsnlqeldlks 

NNIRTI EEI ISFQHbKRLTCLKLWHNKI VTI PPS ITHVKNLESIj 
YFSNNKLESIjPVAVFSLQKIiRCXiDVSYNNISMIPIEIGLLQNLQ 

hlhi tgnkvdilpkqiifkci klrtlnlgqncitslpekvgqlsq 
ltqlelkgncldrlpaqlgqcrmlkksglwedhlfdtlplevk 
ealnqdinipfangi 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSIiEEAEDCYPPSLLTLD 
LRDIjFNQVEQGPIjLSCPKAGTDIjSMGRARBVGWMAAGLMIGAGA 

cycvykltigrddsekleeegeeewdddqeldeeepdiwfdfet 
MARPWt EDGDWTE PGAPGGTEDR PS GGGKANRAHP I KQRP FP YE 
nx\j.M l woMv«i~AJNi3o^v jjUjb&KCljr IQGKIiLFAEPKDAGFPFSQD 
INSHLASLS^lARl^SPTPDPTVREALCAPDNLNASIESQGQIKM 
YINEVCKETVSRCCNS FLQQAGI^LLISMTVINNMLAKSASDLK 
FPLISEGSGCAKVQVLKPLMGLSEKPVLAGEI>VGAQMLFSFMSL 
F I RNGNRE I LLETPAP 




6852 


1 


407 


KiKt»iufc.i ilWt ±KHinuvjKn J. I*' YAARTPATLFAVMFAMYI ISGIiT 

GFIGLNS IAVIiCNLVMGLALI FLCTWAYVKYSGEFREIGTVIDQ 

IAETLWEQVLKPLGDNLMEENIRQSVTNSIKAGLTDQVSHHARL 
KTD 




6853 


3 


469 


GDS CAVCIEI*YKPNDI*VRIIiTCNHI FHKTCVDP WI*LEHRTCPMC " 
KCD ILKALG I EVDVEDGSVSLQVPVSNEIFNSASSHEEDNRSET 
AS SG YASVQG TYEPPLEEHVQS TNES LQLVNHEANSVAVDVI PH 
VDNPTFEEDETPNQETAVREI KS 


6854 


1148 


585 


HES Y I GTFD PGELCVCAAI QWLQDNS AS YFLNRKLiVYE PS TQAK 
P VKNT FLRMW I YSHH I YQQDLR KKI IiD VG KRIiD VTGFCMTG KPG 
1 1 CVEGFKEHCEEFWHT I RYPNWKHISCKHAES VETEGNGEDLR 
LFHS FEELLLEAHGDYGLRM5YHMNL»GQFLEFLKKHKSEHVFQI 
LFGIESKSSDS 




6855 


1913 


1148 


GRVGGRVGR I CSPLSGANEYIASTDTIjKTEEVLLFTDQTDDIAK 
EEP TS LFQRDS ETKGE SGIjVLEGDKE I HQI FEDLDKKLALAS RF 
YIPEGCIQRWAAEMWAtDAIiHREGIVCRDIiNPNNILLNDRGHI 
QIiTYFSRWSEVEDSCDSDAIERMYCAPEVGAlTEETEACDWWSI* 
G AVL FELI/TG KTLVECHPAG I NTHTTLNMPEWVS EEARSL I QQL 
LQFNPLERLG AGVAG VED I KSHPFFTPVDWAELMR 


6856 


1617 


997 


VTQL YVS VDAS TKDS LKK IDRPLFKD FWQQFLDSL.KALAVKQQR " 
TVYRLrLVKAWNVDELQAYAQLVSLGNPDFIEVKGVTYCGESSA 
S S LTMAHVP WHEE WQFVRELVDLI PEYEIACEHEHSNCLLIAH 
RKFKIGGEWWTl^INYlTOFQEIilQEYEDSGGSKTFSAKDYMARTP 
HWALFGAS ERGFDPKDTRHQRKNKS KAISGC 


6857 


1 


617 


KGPEATAMVCVCSHPNCRQNH1 KPSHSAAQTWCGS PTPASAPNH 
KLMAMEQGKTL PSATEDAKEEGLEAQ I S RIAEL IGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYLRGT 
TGVRNCFHI TAVRLSDGFTFVI YEFWETEEAWKRHLQSPLCKAF 
RHVKVDTLSQPEAXiSRII>VPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA " 
LRCVQTAKLILEELKLEKK I KIRVEPGI FEWTKWEAGKTTPTU4 
oiA&liJj]U^/\lNi«I>l±LHDYRPAJPL^ 

IVNTCPQDTGVILIVSHGST1>DSCTRPLIjGIjPPRECX5DFAQ1jVR 
K I PS LGMCFCEENKEEGKWELVNP PVKTLTHGANAAFN WRNW I S 
GN 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDIIQSPSSTGLLKSG 
KTNSVESLPELLTSDSEGSYAGVGSPRDLQSPDFTTGFHSDKIE 
AKVKP YVNGTS PVYS REDLKP WEKS P I bKISAPQP I PSNRI DTT 
SSAS WVAGSFSPVS PPWDLRTIME I EESRQKCGATPKSHLGKT 
VSHG VKLSQKQR KM IALTT KENNSGMNS M ETVLFTPSKAPKP VN 
AWASSLHSVSSKSFRDFLLEEKKSVTSHSSGDHVKKVSFKGIEN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAPVTFASIVEEELQQEAALIRSREKP1ALIQIEEHAIQDLLVF 
YEAFGNPEEFVIVERTPQGPLAVPMWNKHGC 


6860 


1889 


1515 


DKDKKRQKKRG I FP KVATN I MRAWLFQHLTH P YPS E EQKKQLAQ 
DTGLT I LQVNNWF INARRI I VQPMIDQSNRAVSQG AAYSPEGQP 
MGS FVLDGQQHMG I RPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


DKDKKRQKKRG I FPKVATN IMRAWLFQHLTHPYPSEEQKKQLAQ 
DTGLTI LQVNNWFINARRI I VQPMIDQSNRAVSOGAAYSPEGQP 
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6862 


2 


471 


EE I DRE FHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNS EGNA 
DE ED PL»G PNCY YDKTKS FFDK I SCDDNRBRR PTW AEERRLNAET 
FG 1 P LR PNRGRGG YRGRGGLG FRGGRGRGGGRGGTFTAPRG FRG 
GFRGGRGGRE FADFEYRKTTAFGP 


6863 


2216 


487 


PQEPALKS E FSQVASNTI PLPLPQ PNTCKDNGPCKQVC S TVGGS 
AI CSCFPG YAIMADGVSCEDQDECLMGAHDCS RRQFCVNTLGSF 
YCVNHTVLCADGYILNAHRKCVDINECVTDLHTCSRGEHCVNTL 
GS FHCYKALTCEPGYALKDGECEDVDECAMGTHTCX3PG FLCQNT 
KGS F YCQARQRCMDG FLQDPEGNC VD I N ECTS I*S EP CRPG FS C I 
NTVGS YTCQRNPL I CARGYHASDDGTKCVDVNECETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GSYQCYCRC^YQLAEDGHTCTDIDECAQGAGIIjCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDECAIjGTHNCSEAETCHNIQGS 
FRCIiRFECP PNYVQVS KTKCERTTCHDFUECQNS PAR ITHYQLN 
FQTGLIjVPAHIFRIGPAPAFTGDTIALNIIKGNEEGYFGTRRIiN 

aytgwylqravleprdfaldvemki*wrqgs vttflakmhi fft 
tfal 


6864 


2 


2933 


LRDSS PSNLQI I IKELLSMHHQPDPALTKEFDYLPP VDSRSSSG 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQS LFGHLMES KLQY YVPENFWKI FKMWN KEIjYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 
ERE EAFMALNX»GVTSCQSL»E I SIjDQF VRGEVIiEGSN AY Y CEKCK 
EKRI TVKRTCI KSLPSVLVIHLMRFG FDWESGRS I KYDEQI RFP 
WMI»NMEP YTVSGMARQDSSS E VGENGRS VDQGGGGS PRKKVALT 
ENYEL.VGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 
EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMLFY 
QRVSDQNSPVLP KKSRVSWRQEAEDLSLSAPSS PE I SPQS S PR 
PHRPNNDRLS I LTKLV KKGE KKGL FVE KM PAR I YQMVRDENLKF 
MKNRDVYSSDYFS FVLSIiASLNATKliKHPYYPCMAKVSLQLAIQ 
FLFQTYlfRTKKKLRVDTEEW IATIEALLSKSFDACQWLVEYFI S 
SEGREIi X KI FLLECNVREVRVAVAT I IjE KTLDSALF YQDKLKSL 
HQLLEVIjLAIjIjDKDVPENCKNCAQYFFLFNTFVQKQG I RAGDLLi 
LRH S ALRH MIS F LLGASRQNNQ I RRW S S AQARE FGNLHNTVAL.L 
VLHSDVS SQRNVAPG I FKQRPP I S I APS S PLLPLHEE VEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMVVYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKNTFQLLHEILVIEDPIQVERVKFVFETEN 
GLLAIiMHHSNHVDSSRCYQCVKFLVTIiAQKCPAAKEYFKENSHH 
WSWAVQWliQKKMSEHYWTLQSNVSNETSTGKTFQRTISAQDTLA 
YATALLNEKEQSGSSNGSESSPANENGDRHLQQGSESPMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CS CSRG PS VEDGKW YGVRS YLHLF YEGYAVPP KLEG I GEGE FIiV 
LDQRAADYNQALGTCRLAGTA1.CVAAGVLLAI CLFWAMIGWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQDS P I FRNASGQS WFS PPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYT LYGLRATCMRDLD WAW INAVSAFKALEQDLPVN I KF 
1 1 EGMEEAGS VALEELVEKEKDRFFSGYD YI VI SDNLW I SQRKP 
AI TYGTRGNS YFMVE VKCRDQDFHSGTFGG I LHEPMADbVALLG 
SIiVDSSGHILVPGIYDEWPLTEEEINTYKAIHLDIiEEYRNSSR 
VEKFbFDTKEEl LMHLWRYPSLS IHGIEGAFDEPGTKTVI PGRV 
IGKFS IRLVPHMNVSAVEKQVTRHIiEDVFSKRNSSWKMWSMTX 
GLHPWIANIDDTQYLAAKRAIRTVFGTEPDMIRDGSTIPIAKMF 
QEIVHKSWLIPLGAVDDGEHSQNEKINRWNYIEGTKLFAAFFL 
EMAQLH 


6867 


2833 


1704 


GTRIMSQPKQKELAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP 
LQSAESSPTAGKKLPEVPPSEEEEQEAWVNALLGRIFWDFLGEK 
YWSDL VS KK I QM KLS KI KLP YFMNELTLTELDMGVAVPKI LQAF 
KPYVDHQGLW I DLEMSYNGS FLMTLETKMNLTKLGKEPLVEALK 
VGEIGKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 
LPGAEG YVGGHRTS KI MRF VDK I TKS KYFQKATETE F I KKK I EE x 
VSNTPIiLLTVEVQECRGTLAVNIPPPPTDRVWYGFRKPPHVELK 
ARPKLGEREVTLVHVTDWIEKKLEQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTS CLLKDPPVEAADQP 


6868 


1 


346 


R PTR P PTR PEE I KNL I LP Y I SDMNF VQDLCEDFYELF KTDKG FD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGR I VHLSNS FTQTVNCRKPFFSSW 


6869 


3 


1619 


MYMERMDKRALISFWESVEHLKNANKNEIPQLVGEIYQNFFVES ' 
KEISVEKSLYICEIQQCLVGNKGIEVFYKIQEDVYETLKDRYYPS 
FIVSDLYEKLLIKEEEKHASQMISNKDEMGPRDEAGEEAVDDGT 
NQINEQASFAVN KLRELNE KL.EYKRQALNS IQNAPKPDKKI VSK 
LKDE 1 1 L I E KERTDLQLHMARTDW WCENLGMWKAS I TSGEVTEE 
NGEQLPCYFVMVSLQEVGGVETKNWTVPKRLSEFHNLJHRKLSEC 
VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLI,SDER 
LCQSEALYAFLSPSPDYLKVIDVQGKKNSFSLSSFLERLPRDFF 
SHQEE ETEEDSDLSD YGDDVDGRKDALAEPCFML I GE I FELRGM . 
FKWVRRTLlALVQVTFGRT INKQIRDTVSWI FSEQMLVYYI NI F- 
RDAFW PNGKLAP PTTI RS KE QSQETKQRAQQKLLEN I PDMLQS L 
VGQQNARHG 1 1 KI FNALQETRANKHLLYALMELLLIELCPELRV 
HLtUyjjKACaQV 


6870 


1 


1566 


MAAWAATRWWQ LLLVLS AAGMGASGAPQ PPNI LLLLMDDMGWG 
DLGVYGEPSRETPNLDRMAAEGLLFPNFYSANPLCSPSRAAL.LT 
GRLPIRNGFYTTNAHARNAYTPQEIVGGIPDSEQIiLPEIJUKKAG 

yvskivgkwhlghrpqfhplkhgfdewfgspnchfgpydnkarp 

NI PVYRD WEMVGR YYEEFPI NLKTGEANLTQI YLQEALDF I KRQ 

arhhpfflywavdathapvyaskpflgtsqrgrygdavreidds 

I G KI LELLQDLHVADNTFVFFTS DNGAAL I S APEQGGSNGP FLiC 

gkqttfeggmrepalawwpghvtagqvshqlgsimdlfttslal 

AGLTPPSDRAIDGLNLLPTLLQGRLMDRP I FY YRGDTLiMAATLG 

qhkahfwtwtnswenfrogidfcpgqnvsgvtthnledhtki.pl 

* * ""vji^A/irv^i^x^i xr uoi AOKL i VfiurtJjoK J. l £> V V vUHQEAL VPAQP 
QLNVCNWAVMN WAP PGCEKLG KCLTPPES I PKKCLWSH 


6871 


209 


1126 


RMSLNP P 1 FLKRS E ENSSKF VETKQS QTTS IAS ED PLQNLCLAS 
QEVLQKAQQSGRSKCLKCGGSRMFYCYTCYVPVENVPIEQIPLV 
KLPLKIDI IKHPNETDGKSTAIHAKLLAPEFVNI YTYPCI PE YE 
EKDHE VAL I FPGPQS I S I KD I S FHLQKR I QNNVRGKNDDPDKP S 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
D I LKE KYRGQYDNLLFF YS FM YQLI KNAKCSGDKETGKLTH 


6872 


880 


459 


FGLLMWLS LI FMKGNCVREDLI FNFLFKLGLDVRETNGLFGNT 
KKLITEYFVRQKYLEYRRIPYTEPAEYEFLWGPRAFLETSKMLV 
LRFLAKLHKKDPQSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVLCS KDKTYDLK IADTSNMLLFI PGCKTPDQLKKEDSHCN 
I IHTE I FG FSNN YWELRRRR PKLKKLKKLIiMENP YEG PDSQKEK 
DSNSSKYTTEDLLDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
DYEMKIJC,!raVTQLVDSESWSFGKVPLNTCLQELGPLEPEEMIEH 
CLKCYGKKYVDEGEVYFELDADKI CRAAARMLLQNAVKFNIAEF 
QEVWG^SVPEGMVTSLt)QLKGLALVDRHSRPEIIFLLKVDDLPE 
DNQERFNSLFSLREKWTEEDIAPYIQDLCGEKQTIGALLTKYSH 
SSMQNGVKVYNSRRPIS 


6874 


1 


307 


DS I ADH VNS AAVNVE EGTKNLG KAAKYKLAALP VAGAL I GGMVG 
G P I GLLAGFKVAGI AAALGGG VLGFTGG KL I QRKKQKMME KLTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGERGNSASEKWEIMFNEELGDPFIIIHSISLLNAEEHSIA 
TLLLRIEKEELDMKGSGFYVSLEVJVTISKKNQDNKKYEIIKRDI 
LRGKSVPHYAA1EPDGNGLMIVSYKSLTFVQAGQDLEENMDEDI 
SEKI KEPLY YWQQTEDDLT VT I RLP EDNTKED I Q IQFLPDHINI 
VLKDHQFLEGKLYSS IDHESSTWI IKESNSLE ISLI KKWEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKERPP 
CNAQELEECD I FFEESSSLCR FDGNTLKTTHWNLGSNQYLFSV 
IVDP KEMPCFCLRHDVDALLWQPHS S KQDDMWEHI ATFNALG YV 
QASKRDKKFFACAPN YS YAALCECLRRVFI YRQPAPMSTVLYNR 
KEGRQVGQVAKQQVASLETNDPI LGFQATNERLFVLTTKNLFLI 
KVNTEN 


6876 


41 


1285 


VGEMTLI WRHLLRPLCLVTS APRILEMHP FLS LGTSRTS VTKLS 
LHTKPRMPPCDFMPERYQVIFLVNSGSFJVNELAMLMARAHSNNI 
DIIS FRGAYHGCSPYTLGLTNVGI YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDS PVQT I RKCSCAPDCCQAKDQ Y I EQFKDTLSTS 
VAKS IAGFFAEP IQGVNG WQYPKGFLKEAFELVRARGG VCIAN 
EVQTGFGRLGSHFWGFQTHDVLPDI VTMAKG I GNGFPMAAVITT 
PEIAKSLAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYMLLKFAKLRDEFEI VGDVRGKGLMIGI EMVQDKI SCRPL 
PREEVNQIHEDCKHMGLLVGRGS I FSQTFRI APSMC ITKPE VDF 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


GTS P S PARAYAPPTERKRFYQNVS I TQGEGGFE I NLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDTI KY YTMHLTTLCNTSLDN 
PTQRN KDQLI RAAVKFLDTDTI CYRVEEPETLVELQRNEWDPI I 
EWAEKRYGVE I SSSTS IMGPS IPAKTRE VLVSHLAS YNTWALQG 
IEFVAAQLKSMVLTLGLIDLRI/TVEQAVLLSRLEEE YQ I QKWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


QTLQGDFKNRAEMI DFN I RI KNVTRSDAGKYRCEVSAPSEQGQN 
LEEDTVTLEVLVAPAVPS CEVPSSALSGTWELRCQDKEGNPAP 
EYTWFKDGIRLLENPRLGSQSTNSSYTMNTKTGTLQFNTVSKLD 
TGEYSCEARNS VGYRRCPGKRMQVDDLNISG 1 1 AAVWVALVIS 
VCGLGVC YAQRKG YFS KETS FQKSNS SSKATTMSENDFKHTKSF 
II 


6879 


3 


845 


IRVIGESDI MQEFLSESDEN YNGVS DVELRVALPDGTTVTVR VK 
KNSTTOQVYQAIAAKVGMDSTTVl^FALFEVISHSFVRKLAPNE 
FPHKLY I QNYTSAVPGTCLT IRKWLFTTEEEI LLNDNDLAVTYF 
PHQAVDDVKKG Y I KAEEKS YQLQKLYEQRKMVMYLNMLRT CEG Y 
NEI I FPHCACDSRRKGHVITAIS ITHFKLHACTEEGQLENQVIA 
FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKIFTPYFNYMHE 
CFERVFCBLKWRKEEY 


6880 


21X0 


1437 


MANI YNEKILKEGNQLTES I F IQNSKLYFFGILFNGLTLGLQRS 
NRDQI KNCGFFYGHRAFSVALIFVTAFQGLSVAFILKFLDNMFH 
VLMAQ VTTV 1 1 TTVS VLVFDFRPSLE FFLEAPS VLLS I F I YNAS 
KPQVPEYAPRQERIRDLSGNLWERSSGDGEELERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDSKWEDI HVI -I-GALKMFFRELPEPLFTFNHFNDFVNAI KQEPR 
QRVAAVKDLIRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 
IAI VFG PTLLKP EKETGN I AVHTVYQNQI VEL I LLELS S I FGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNMVTARQBPRLVLISLTCDGDTLTLSAAYTKDLLLPIKTPT 
TNAVHKCRVHGLEIEGRDCGEATAQWITSFLKSQPYRLVHFEPH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 
KVKATNFRPNIVISGCDVYAEDSWDEIiLIGDVELKRVMACSRCI 
LTTVDPDTGVMSRKEPLETLKSYRQCDPSERKLYGKSPLFGQYF 
VLENPGTI KVGDPVYLLGQ 


6883 


2794 


2256 


NS KLKLNQNLKLFI TLTYQ VLS LHGWGPGIHLQKEGAFPVTQNR 
ALQLLYDLR YLNI VLTAKGDE VKSGRS KPDS RI EKVTDHLEAL I 
DPFDLDVFTPHIjNSNLHRLVQRTSVIjFGL'VTGTENQIiAPRSSTF 
iNi»y CirHN J. Lttr LiAoav XKr ULDPlrSMrS TRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


EFERVTAEAVKPRETSEPRAAAQRFCEKFPFL 


6885 


297 


1554 


STGQFWHVTDLHLDPTYH I TDDHTKVCASSKGANASNPG PFGDV 
LCDS P YQLILSAFDFI KNSGQEAS FMI WTGDS PPHVPVPELSTD 
TVINVITI^TTTIQSLFPNLQVFPALGNHDYWPQDQLSVVTSKV 
YNAVANLWKP WLDEEA I S TLRKGG F YSQK VTTNPNLR 1 1 SLNTN 
L Y YG PN IMTLNKTDPANQ FEWLESTLNNSQQNKEKVYI I AHVPV 
GYLPSSQNITAMREYYNEKLIDIFQKYSDVIAGQFYGHTHRDSI 
MVLSDKKGSPVNSLFVAPAVTPVKSVLEKQTNNPGIRLFQYDPR 
D YKLLDMLQ Y Y LN t»TE ANLKGES I W KLE Y ILTQTYDI EDLQPES 
LYGLAKQFTIIiDSKQFI KYYNYFFVS YDSSVTCDKTCKAFQ ICA 
IMNLDNI S YADCLKQLY I KHNY 


6886 


2 


1341 


QCGG I PGREGGSSR PLEEGTGSSPACVRGAAPGSEDAFYPTRAK 
OARVSQELKKAAKRTVS IS EGPDTLGDGMRERRETLALAPE PEP 
LEKEACEKWKRPFRSASATSLTLSHCVDWKGLLDFKKRRGHSI 
GGAPEQRYQI I PVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS 
LLATGGADRLIHLWNVVGS RLEANQTLEGAGGS ITS VDFDP5GY 
QVLAATYNQAAQLWKVGEAQS KETLS GHKDKVTAAKFKLTRHQA 
VTGSRDR rVKEWDLGRAYCSRTINVLS YCNDWCGDHI I ISGHN 
DOKlRFWDSRGPHCTQVIPVQGRVTSIiSLSHDQLHLLSCSRDNT 
LKVI DI»R VSNIROVPRADGFKCGSDWTKAVFS PDRS YALAG S CD 
GALY IWDVDTGKLESRLQGPHCAAVNAVAWCY SGSHMVSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKP FW EAGAVPGD PLSTGCS QAQLGG CCPRGP WGPQHG 
GQQRAAG PTLPRGERGGPQQSGPGLAAQTPPTS KQVAWRAFLTG 
TYRSQS PRSPAGP FRGGTG WW PE PAVCLCVAVGPQRIiS S PGLVY 
NASG S EHCYDI YRL YHS CADPTGCGTG PDARAWD YQACTE INLT 
FASNNVTDMFPDL PFTDELRQR YCLDTWG VW PRPDWt»LTSFWGG 
DLRAASNI I FSNGNLDPWAGGGIRRNLSASVI AVTI QGGAHHLD 
LRASHPEDPASWEARKLEATIIGEWVKAARREQQPALRGGPRL 
Sh 


6888 


1 


992 


FVAYVKKEI PHIWTHCJjLNPHALVI KTLPTKLRDALFTWRVI 
NFIKGRAPNHRLFQAFFEEIGIEYSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFLHHKSS^VDGFENKEFKIHLAYLADLFKHLNE 
I>SASMQRTGMtnvSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EE 1 1 VSDNEG I F I AAE I TLHLQQLS NF FHG YFS IGDLNEAS KW I 
LDPFLFWI DFVDDS YLMKNDIiAELRASGQ I LMEFETMKLEDFWC 
AQFTAFPNLAKTALE I LMPFATTYDCELGFS ITFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLENQIKEERF^DNSESPNGRTSPLVSQNNEQGSTLRDIiLTTT 
AGKLR VGS TDAG IAFAPVYSMGAPS SKSGRTMPNI LDDI IAS W 
ENKI PPS KTS KINVKPELKEEPEES I ISAVDENNKLYSDI PHSW 
I CEKHI LWL^YKNSSNWKLFKEa^KQGQPAVVSGVHKKMNISL 
WKAES I SLDFGDHQADLLNCKDS I ISNANVKE FWDGFEEVS KRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLASHLPGFFVRPDLGPRLCSAYGVVAAKDHDI GTTNLHI 
EVS D WNI LVYVG I AKGNG I LS KAG I LKKFE EEDLDD ILRKRLK 
DSSElPGALraiYAGKDVDKIRBFLQKISKEQGLEVIiPEHDPIR 
DQSWYVNKKIjRQRI^BEYGVRTWTLIQFLGDAIVI^AGAIiHQVQ 
N FHS CI Q VTEDFVS PEHL VES FHLTQELRLLKEE I N YDDKLQVK 
NTLYHAVKEMVRADKIHEDEVDDMEEN 


6890 


3 


667 


THACGMW I PLYLHRALVVHKTAETCNS PPCGAKDSLI FGAITCF 
TGFLGVDTGAGATRWCRLKTQRADPLVCAVGM LGS A I FICLI FV 
AAKS S I VGA Y I CI FVGETLLFSNWAI TAD I LMYWI PTRRATAV 
ALQS FTSHL1UGDAGS PYLIGFI SDLIRQSTKDSPLWEFIiSLGYA 
LMLCPFVWLGGMFFXiATALFFVSDRARAEQQWQLAMPPASVK 
V 


6891 


1980 


1262 


LR I HQELLS KELKLLRG ITI ES 1 1 H I GLAAGKEQFMQDASNVMQ 
LLLKTQSHLYNMEDNNPEVRQAAAYGLGVMAQFGGDDYRSLCSE 
AVPLLVKVIKRAHSKTKKNVIATENCISAIGKILKFKPNCVNVD 
EVLPHWLS PLHEDKEEAI QTLS FLCDL I ESNHP WIG PNNSN 
LPKI ISI IAEGKINETINYEDPCAKRLANWRQVQTSEDLWLEC 
VSQLDDEQQE ALQELLN FA 


6892 


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV 
FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVAIELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRKIKELE 
NYAENTQSSLLYLTLE ILGI KDI*HADHAASHIGKAQG1 VTCLRA 
TP YHGS RRKVFLPMD I CMLHGVSQEDFLRRNQDKNVRDV I YD I A 
SQAHLHLKHARSFHKTVPVKAFPAFLQTVSLEDFLKKIQRVDFD 
I FHPSLQQKNTLLPLYLYIQSWRKTY 


6893 


1 


842 


DGER KSMS VERTFS E INKAEEQYS LCQEI>CS EIiAQDLQ KKRLKG 
RTVT t KLKNVN FEVKTRAST VSS WS TAKE I FAI AKELLKTEID 
ADFPHPLRLRIiMGVRISS FPNEEDRKHQQRS 1 1 GFLQAGNQAIjS 
ATECn-LEKTDKDKFVKPLEMSHKiCSFFDKKRSERKWSHQDTFKC 
EAVN KQS FQTSQP FQVLKKKMNENLE I SENSDDCQILTCPVCFR 
AQGCISLEALNKHVDECLDGPSISENFKMFSCSHVSATKVNKKE 
NVPASSLCEKQDYEAH 


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 
DVRDRGHGRPWQPSLEPSI>PPTLCFPSI*SSFSSSWPSAQHI*TPS 
VFNPW 


6895 


2379 


478 


VTY VE LCDLAS PTALI»IMRTVLDL I VEDLQSTSEDKEQQ YTSQT 
TRLLALL YALASHKACKIAILHLINGTI KGDERYAE I FQDLIiAIi 
VRSPGDS VIRQQCVE YVTS I LQS LCDQDI ALI LP S SSEGS I S EL 
EQLSNSLPNKELMTS I CD CLLATIjANSES S YNCLLTCVRTMM FIi 
AEHDYGLFHLKSSLRKNSSALHSLLKRVVSTFSKDTGEIiASSFL 
EFMRQILNSDTIGCCGDDNGIjMEVEGAHTSRTMSINAAELKQLL 
QSKEESPENLFIjELEKLVLEHSKDDDNLDSLLDSVVGLKQMLES 

mwftpfqaeeidtdldlvkvdlielsekccsdfdlhselersf1, 
sepsspgrtkttkgfklgkhkhetfi tssgkseyi epakrahw 
ppprgrgrggfgqg irphdi frqrkqntsrppsmhvddfvaaes 
kewpqdgipppkrplkvsqkissrggfsgnrggrgafhsqnrf 
ftppaskgnysrregtrgsswsaqntprgnynesrggqsnfnrg 

PLPPLRPIiSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 

kfvsggsgrgrhvrs ftr 


6896 


1 


555 


GNI VIQKKKYNKQH 1 1 PLENVTI DS 1 KDEGDURNGWI* IKTPTKS 
FAVYAATATEKSEWMNHINKCVTDLLS KSGKTPSNBHAAVWVPD 
SBATVCMRCQKAKFTPVNRRHHCRKCGFVVCGPCSEKRFLLPSQ 
SSKPVRICDFCYDLLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 


GDGLMH E WNGLMERPDWETAIQKP LCS LPAGSGNALAAS LNHY 
AG YEQ VTNEDLLTNCTLLLCRRIiLS PMNLLS LHTASGLRLFSVL 
SLtAWGFIADVDLESEKYRRI^EMRFTLGTFLRtAALRTYRGRLA 
YLP VG RVGS KT PAS P VWQQG PVDAHLVPLEE P VP S HW TWPD E 
DFVLVLALl^SHrjGSEMFAAPMGRCAAGVMHLFYVRAGVSRAML 



I^LFIAMEKGRHMEYECPYLVYVPVVAFRLBPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVAS LLKGRQG I YTENERRMG AVI KIR FF K I MLVL 1 1 C W 
LSN I INESLLF YLEMQTDINGGSLKPVRTAAKTTWFIMG1LNPA 
QG FLLS LAFYGWTGCSLGFQS PRKE I QWESLTTSAAEGAHPS PL 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIEIHTASESC 
NKNEGDPALPTHGDL 


6899 


120 


827 


MKVRK1WDAYLLDKNKINMDCFISCFFKKMLTTLMFSHSGILSL 
LEHGEE YTFS I» P CAYARS I IiT VP WVE LGGKVS VNCAKTG YS AS I 
TFHTKP FYGG KLH R VTAE VKHN I TNT WCR VQG EWNS VLE FTYS 
NGETKYVDLTKLA VT K KR VR PLEKQDPFESRRLWKNVTDSLRES 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYFIKEGDGWVY 
HKPLWKIIPTTQPAE 


6900 


3 


451 


TE VLGS KGIHELRSSTSALHHALEESASLLTMFWRAALPS THI P 
VLPGKVGESTERELLELRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQ F I V5QLTRTHD VLKKARTNLEVR KLLHQS E APSLS PTHHH P 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDNM VQR LETD FKMTLQQQS TLEQ WAAWLDNVMMQAL KP YEGRP 
SFPKAARQFLLKWS FYRYHLGFS 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSAIiEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTG I HI LVI DQMVQNFQDESCFLFSTVKAESSDG I 
HULK 


6904 


464 


2092 


MEASLPVSLSCVIACGDVEGKFDILFNRVOATnVK'Kf;iaT?rvT nn' 
VGNFFGSTQDAE WEE YKTG I KKAP I QT YVLGANNQ ET VKYFQDA 
DGCELAENITYLGRKGIFTGSSGLQIVYLSGTESLNEPVPGYSF 
SPKDVS SLRMMLCTTSQFKGVDILLTS PWPKCVGNFGNSSGEVD 
TKKCGSALVSSLATGLKPR YHFAALEKTY YERLP YRNHI I LQEN 
AQHATR FI ALANVGN PEKKKYLYAFS I VPMKLMDAAELVKQPPD 
VTENPYRKSGQEAS I GKQILAPVEESACQFFFDLNEKQGRKRSS 
TGRDSKS S PHPKQPRKPPQPPGPCWFCLAS PE VE KHL WNIGTH 
CYLALAKGGLSDDHVLILP IGHYQSWELSAEWEEVEKYKATL 
RRFFKSRGKWCWFERNYKSHHLQLQVI PVPISCSTTDDIKDAF 
I TQAQEQQ IELLE I PEHSDI KQ IAQPGAAYF YVELDTGEKLFHR 
I KKNFPLQFGREVLAS EAI LNVPDKSD WRQCQ I S KEDEETLARR 
FRKDFE P YDFTLDD 


6905 


1 


226 


VS KTGEAET ITSH Y LFALG V YRTL YLFNW I WRYHFEGFFDL I AI 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


611 


SYDDHNGHIDFITAASNLRAKMYSIEPADRFKTKRIAGKIIPAI 
ATTTATVSGLVALEM I KVTGG YPFEAYKNWFLNLAI P I WFTET 
TEVRKTKI RNG I S FT I WDRWTVHGKED FTLLDF INAVKEKYG I E 
PTMWQGVKJ^YVP\^PGHAJO?LKLTMHKLVKPTTEKKYVDLTV 
S FAPDIDGDEDLPGP PVRYYFSHDTD 


6907 


o 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIMSRRSQRLTRYSQGDDDGS 
SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNMKRLSPAPQLGPSS 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGS ESSRASGLVGRKATEDFLGSS SGY S S BDD YVG YSDVDQQS S 
S S RLRS AVS RAGS LLWMVATS PGRLFRLL YWWAGTTW YRLTTAA 
SLLDVFVLTRRFSSLKTFLWFLLPLLLLTCLTYGAWYFYPYGLQ 
TFHPALVS WWAAKDS RRADEGWEARDSS PHFQAEQRVMSR VHS L 
ERRLEALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHED 
TLALLEGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDS E 
DLFKKI VRASQESEAR I QQLKS EWQSMTQES FQESSVKELRRLE 
DQLAGLQQE LAALALKQS S VAEE VGLLPQQ I QAVRDDVESQFPA 
WISQFLARGGGGRVGLIjQREEMQAQLREI^SKILTHVAEMQGKS 
AR EAAASLS LTLQKEGVIGVTEEQ VHH I V KQALQRYS EDR I GLA 
DYALESGG AS VI STRCS ETYETKTALLSLFG I PLWYHSQSPRVI 

T.l^^nX7TJD/ r *Mf* , MJvir , rv™ , T>r\r , 'C , 7V"t7TroT c 7\ r> T o tvt 1 7\ \ rt>r t?Tn;nvA t ct» 
1j v^U VMfc*vji>J LWAr yv> F yv» r A v V Kl»i> AKXKir 1 AVTLtEHV P rvAJUo P 

NST I S SAP KDFAI FGFDEDLQQEGTLLGKFTY DQDGEP I QT FHF 

QAPTMATYQWELRI LTNWGHPE YTCI YRFRVHG EPAH 


6908 


3 


780 


QVPSAAWLMAVCGLGSRliGIiGSRLGIiC^CFGAARIjIiYPRFQSRG 
PQGVEDGDRPQPSSKTPRI PK1 YTKTGDKGFS STFTGERRPKDD 
QVFEAVGTTDEL SSA I G FALELVTEKGHTFAEELQKI QCTLQDV 
GSALATPCSSAREAHLKYTTFKAGP ILELEQW IDKYTSQLPPLT 
AFILPSGGKI S SALHFCRAVCRRAERRVVPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAM KEGNQEKI YKKNDPS AESEGL 


6909 


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
SPYLRGTIKMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCLYA 
LCWDTI KRS SQTGE WQN I AIMTEE PELS PAYLI S EAMRRS RMS 
LYC 


6910 


1 


1068 


LVPVWIDSYYYGKLVIAPLNIVLYNIFTPHGPDLYGTEPWYFY 
L 1 NGFLNFNVAFALALLVLPLTS LM E YLLQRFHVQNLGHP YWLT 
LAPMYIWFI I FFIQPHKEERFLFPVYPLICLCGAVAliSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWLALGTVFLFGLLSFSRSVA 
LFRG YHG PLDL YPE FYRI ATDPTIHTVPEGRP VNVCVGKE W YRF 
PSSFLLPDNWQLQFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 
NLEE PS RY I D I S KCH YLVDLDTMRETPREPKYSSNKEEW I S LAY 
RPFLDASRSSKLLRAFYVPFLSDQYWYVNYTILXPRKAKQIRK 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLI S I FGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGQI CCDKPVLRDMNPWSTAI VAF 


6912 


1 


844 


AMKPVETHSFQMLFTILSTGSALKAQSYEBAYRCI KSSILLGSI 
SGGTDI I SCFMGHNFS LP VYKG EI QARNLGMAVEAWNEEGKAVW 
GESGELVCTKP IPCQPTHFWNDENGNKYRKAYFS KFPGIWAHGD 
YCRINPKTGG I VMLGRSDGTLNPNGVRFGSSE I YNXVESFEEVE 
DSLCVPQ YNKYREERVI LFLKMASGHAFQPDLVKR I RDAI RMGL 
SARHVPSLILETKGI PYTLNGKKVEVAVKQI IAGKAVEQGGAFS 

v* JF1S I XjJJLi x KJJ A Ir&lJjKsb 


6913 


1643 


1558 


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 
LiWNAUliAKPDTl^SGSTSGWKDMFNVNVIiAL I CTREAYQSMK 
ERNVDDGHI IN I NSMSGHRVLPLS VTHFYS ATKYAVTALTEGLR 
QELREAQTHI RATCI SPGVVETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVI YVLSTPAHI QIGDIQMRPTEQVT 


6915 


254 


652 


GRSLS FKTFL I WVLI S I YQGG ILM YGALVLFES E FVHWAI S FT 
ALILTELIjMVALT VRTWH WLMWAEFLSLGCYVS S LAFLNE Y FD 
VAF I TTVTFLWKVSA I TVVSCLPLYVLKYLRRKLS P PS YCKLAS 


6916 


254 


652 


GRSLS FKTFL I WVL I S I YQGG I LM YGALVLFES EFVHWAI S FT 
ALILTELLWALTVRTWHWLMWABFLSLGCYVS S LAFLNE YFD 
VAFITTVTFLWKVSA1TWSCLPLYVLKYLRRKLS PPS YCKLAS 


6917 


254 


652 


GRSLS FKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALT VRTWHWLMWAEFLSLGCYVSS LAFLNE YFD 
VAF I TTVTFLWKVSAI TVVSCLPLYVLKYLRRKLS PPSYCKLAS 


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRS FPQWLPGGGGGQVSSCS 
DTDVP YLLLAVKSEPGRFAERQAVRETWGS PAPG I R LLFLLG S P 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRHCPTVS FVLRAQDDAFVHTPALLAHLRAL PPAS ARS LYLG E 
VFTQAMPLRKPGGPFYVPES FFEGGYPAYASGGGYVIAGRLAPW 
LLRAAARVAPFPFEDVYTGLCIRaLGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVR PLGPQAS IRLWKQLQDPRLQC 


6919 
6920 


850 
1418 


41 
591 


QGRRELSGSVFCPFIQQEPKEMLTI^SEYHERVRSQGQQLQQLQA 
ELDKliHKEVSTVRAANSERVAKLVFQRLNEDFVRKPDYALSSVG 
AS I DLQKTSHD YADRNTA YFWNR FS FWN YARPPTVI LE PHVFPG 
NCTAFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 
RDFAVFFLLSFFTHQGLQVYDETEVSLGKFTFDVEKSEIQTFHL 
QNDP PAAFP KVKIQ I LSN WGHPRFTCLYRVRAHGVRTS EGAEGS 
AQGPH 

EAQGPSKVHLTLKKKK 


6921 


2 


1711 


* u"**A«oii.E»vi **varj±iAty iijKJU^ENYLKEKQLCDVLLiIAGHLRI 
PAHRLVLSAVSDYFAAMFTNDVLEAKQEEVRMEGVDPNALNSLV 
QYAYTGVLQLKEDTIESLIAAACLLQLTQVIDVCSNFI.IKQLHP 
SNCLGIRSFGDACJGCTELLNVAHKYTMEHFIEVIKNQEFI.LLPA 
NE I S KLLCSDDINVPDEETI PHALMQWVGHDVONRQGELGMLLS 

YIRLPLLPPQLLADLETSSMFTGDLECX5KLLMEAMKYHLL*PERR 
SMMOSPRTlCRRTC^TT/niVT .vnyrrMnAMVPiNPT 

v rR ivrnxvoi VLiiUjiMVtJGMUAMKGTTTIEKYDIjRTNSWI* 

HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 

I WT VMPPMS THRHGLGVATLEG PM YAVGGHDG WS YLNTVER WD P 

EGRQWNYVASMSTPRSTVGVVALNNKLYAIGGRDGSSCLKSMEY 

FDPHTNKWS LCAPMS KRRGGVG VATYNGFLY WGGHDAPASNHC 

SR^DCVERYDPKGDSWSTVAPI^VPRDAVAVCPLGDKJLYVVGG 

YDGHTYI^m^ESYDAQRNEWKEEVPVNIGRAGACVVVVKIiP 


6922 
6923 


1075 


369 


rrrt " XK " i: ' v *^t-'KiiKi!;K£;Kti;KKKlfiKbPijDSTGSELKQNIHSIT 
GLPPAMQKVMYKGLAPEDKTLREIKVTSGAKIMGGGSTINDVLA 
VNTPKDAAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTBK 
LPM G S IKNWS E P I EGHED YHMMAFQDGPTEAS YYWVYWVPTQ Y 
VDAI KDTVLGKWQYF 


6924 


2469 


1660 


i.GLFCILPlDTl.CAVLERDTLSIRESRLFGAWRWAEAECQRQQ 

LPVTPGNKQKVLGKAIjSIjI RFPLMT1 EEFAAGPAQSGI LS DREV 

VWLFIiHFTVNPiCPR VE Y T DR PR CCT T?r , vr?rr> r *ro trnnt m^r,,,-,, 
-»■ "urivTR v& i AiJtv±rit^uljK^ISJiL.<^XWKi*yyVESRWGY 

SGTSDRIRFTVNRRIS I VGFGLYGS IHGPTDYQVNIQI IEYEKK 
QTLGQNDTGFSCDGTANTFRVMFKEPIEILPNVCYTACATLKGP 
DSHYGTKGIiKKWHETPAASKTVFFFFSSPGNNNGTSlEDGQIP 
EIIFYT 


6925 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGALAKKPYNPI IGETFHCSWEVP 
KDRVKPKRTASRSPASCHEHPMADDPSKSYKIiRFVAEQVSHHPP 
ISCFYCECEEKRLCVNTHVWTKSKFMGMSVGVSMIGEGVLRI*LE 
HGEEYVFTLPSAYARS ILTI PWVELGGKVSINCAKTGYSATVI F 
KTKP FYGGKVHR VTAE VKHNPTNTI VCKAHGEWNGTLEFT YNNG 
ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVTRYLRLGDI 
DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 
SPLESTLMGLEVQSFPV 


6926 


2 
1 


1653 
733 


RGGAAGAAMBPDSVIEDKTIELMCSVPRSLWLGCANLVESMCAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWSESDQVEFVEHDISRMCHYQHGHINSYIJCPMLQRDFITALP 
EQGLDH IAENIL S YLDARS LCAAELVCKEWQRVI S EGMLWKKL I 
ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNS FYRSLYPKI 
IQDIET IESNWRCGRHNLQR IQCRS ENS KGVYCLQ YDDEKI I SG 
LRDNS I KI WDKTS LECLKVLTGHTGS VLCLQYDER VI VTGSS DS 
TVRVWDVNTGEVLNTIjIHHNEAVIjHLR fsnglmvtcs kdrs iav 
WDMAS ATDI TLRR VLVGHRAAVNVVD FDDKYI VSASGDRT I KVW 
STSTCEFVRTLNGHKRGIACLQYRDRI.VVSGSSDNTIRLWDIEC 
GACLR VX.EGHE ELVRCI R FDNKR I VSG A YDGK I KVWDliQ AALD P 

rapastlclrtlvehsgrvfrlqfdefqiissshddtiliwdfi* 
kvppsaqnetrs psrtytyi sr 

sgrvamdglglqfpeqgfpagppj^lpphmgghyrdcqslgappl 
DGYPIjPTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 

YAGP PEP PAGPMHPRLGPEPAGPS I PGLLAPPSALHVYYGAMGS 

PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 

DPSQPAELI.GEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHfi A T QQ\A/CTM\ CCD\rvvr , MVDriu 

V VO.L/i-\i>o.r-VV X X V^IN x trUV 


6927 


2 


1484 


LTIjCGD I QLMliAQMANNRAAHLEE FH YQTKEDQE I LHSLHRESS 
CQG FAWATDLS TDLES QLS VSCKC YEAANEI LQFRDLKSQNPEH 
x vyvututntjW iKNit xta V r xTWQAAAIjQSERLVSKSVSAAEQQLW 

KKS FS CFEKGIHNFES i edatnaalllcntgrlmricaqahcga 
gdelkrefspeeglyynkaidyylkalrslgtrdihpavwdsvn 

WBLS TT YFTMATLQX2D YAPLSRKA.QEQXEKEVS EAMMKSLKYCD 

vdsvsarqplcqyraatihhrlasmyhsclrnqvgdehlrkqhr 
vladlhyskaaklfqllkdapcellrvqlervafaefqmtsqns 
nvgkj^ktlsgai^dimvrtehafqliqkelieefgqpksgdaaaa 

ADAS PSLNRBEVMKLLS I FESRLS FLL.LQ S I KLLSSTKKKTSNN 
lEDDTILKTNKHIYSQLLRATANKTATl^ERIl^IVHLLGQLAA 


6928 


1086 


777 


EAIDLIIWLLQVKMRKRYSVDKTl^HPWLQDYQTWLDLRELECK 
IGERYITHESDDLRWEKYAGEQGLQYPTIILINPSASHSDTPETE 
ETEMKALGBRVS 1 1* 


6929 


1749 


607 


RDQRG YRDDRS PAREPGDVSARTRSGGGGGRSATTAMP P P VPNG 
NLHQHDPQDLRHNGNVWAGRPSCSRGPRRAIQKPQPAGGRRSG 
RGPAAGGLCLQP PDGG TCV P EEPPVPPMDWEALEKHLAGLQFRE 
QEVRNQGQARTNSTSAQKNERESIRQKLALGS FFDDGPGI YTS C 
S KSGKPSLSSRLQSGMNLQ I CFVNDSGSDKDSDADDSKTETSLD 
TPLS PMS KQSS S YSDRDTTEEESESLDDMDFIjTRQKKLQAEAKM 
ALAMAKPMAKMQ VEVEKQNRKKS P VADLLPHMPH I S ECLMKRS L 
KPTDLRDMTIGQl^VIVNDLHSQlESIiNEELVQLliLIRDELHTE 
QDAMLVD I EDLTRHAES QQKHMAE KM PAK 


6930 


131 


545 


FKDTAWVFVS L FQMRNNFRH Y FI E PS QLKLFYDVT TWI VT^VAI 
S YTWP FVLLS I KPSLTFYS S WYYCXHILGILVLLLLPVKXTQR 
R KNTHENI QLS QS KKFDEGENSLGQNS FSTTNNVCNQNQE IASR 
HSSLKQ 1 


6931 


2 


659 


FVERLPNRPACLLVASGAAEGVSAQS FLHCFTMAS TAFNLQVAT 
fKj\jru\i v in.t* vuv i t* b-M AK WVQDrRIjKA YASPAKLES I DGAR YHAL 
L I PSCPGALTDLAS SGSLAR I LQHFHSES KP I CAVGHGVAALCC 
ATNEDRS WVFDS YS LTGPS VCELVRAPGFARLP LWEDFVKDSG 
ACFSASEPDA VHWLDRHLVTGQNAS S TVPAVQNLL FLCGSRK 


6932 


2 


1131 


FVDSPGQGEQAEEEEGGIQMNSRMRAHS PAEGASVES SSPGPKK 

»v^c\j«v^i^oJa*\Av?iit*i3X AoMJJi^rSXKYVSHQHPSHPQLFSIVR 
QACVRSLS CEVCPGREGPIFFGDEQHGFVFSHTFFI KDSLARGF 
QRWYS 1 1 TIMMDR I YL INS WPFLLGKVRG I IDELQG KALKVFEA 
EQFGCPQRAQRMNTAFTPFIiHQRNGNAARSLTSLTSDDNLWACL 
HTSFAWLLKACGSRLTEKLLEGAPTBDTLVQMEIOADLEEESES 
WDNSEAEEEEKAPVLPESTBGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTL 
KTLQEVTDSLLGGWLMAQGVGGI I 


6933 
6934 


1431 


890 


SLNx^CTLPPPPHQYPAGYPSDKEGKKPKGOSKKQPSGTTKRPI 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIRBDCQNQKLW 
DEVLSHI»VEGPNFLKK1jEQSFMCVCCQEIjVYQPVTTECFHNVCK 
DCLQRS FKAQVFSCPACRHDLGQNYIMI PNEILQTLLDLFFPGY 
SKGR 




3030 


2588 


DRDHSQCGGIRRVAIARVSSVKLISKAKIRTVKMTFIIVIiAFlV 
CWTPFFFVQMWSVWDANAPKEASAFI I VMLLASLNSCCNPWIYM 
LFTGHLi FHEIj VQRFIjCCSAS YliKGRRLGETSAS KKSNSSS FVLS 
HRSSSQRSCSQPSTA 


6935 


886 


543 


NSALYVAGGNDGTSCLNSVERYS PKAGAWES VAPMN I RRSTHDL 
VAMDGWLYAVGGNDGS SSLNS IEKYNPRTNKWVAASCMFTRRS S 
VGVAVLELLNFPPPSS PTLSVSSTSL 


6936 


1347 


567 


RSHRRQFLSRALLEFFGKSHPPPHRLFRKSLNVGLHYSHIPFLT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
ME KRLQE AQLYKEEGNQRYREG KYRDAVS R YHRALLQLRGLDP S 
LPSPLPNLGPOJ3PALTPEQENILHTTQTDCYNNLAACLLQMEPV 
NYERVREYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYIili 
AAVNRQPKDANVRRYLQLTQSELSSYHRKEKQLYLGMFG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDY 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCLEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TSPHWSTHTSDAGYCMEFKTBSLTPHCALENRPLTRWMQYLREG 
YTVC VDCQ P PAMNS VSLRCSGDGLDSDGNQTLHWQAI GNPRCQG 
TWKKVRRVDQCSCPAVHSFI FI 


6938 


3 


719 


NSRKbEIiAERVDTDFMQLKKRRQSSEKENDSGTLDTVGAVVVDH 
EGNVAAAVSSGGLALKHPGRVGQAALYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTI LARECSHALQAEDAHQALLETMQNKF1SS 
PFLASEIX3VTjGGVIVI*RSCRCSAEPDSSQNKQTLI*VE FLWSHTT 
E S MCVG Y M S AQ DGKAK.TH I S RL P P G A VAG QS VAI EGG VCRLGE P 
SELTLQAECEASQRHFRT 


6939 


3 


810 


kvtaprrpqryssghgsdnssvlsgelppamgrtalfhhsggss 
G YESIjRRDS eatgs as sapdsms esgaas pgartrs lks p kkra 
tglgrrrli paplpdttalgrkps l pgq wvdlp pplags lkep f 
e i kvyeiddverlqrprptpreaptqglacvstrlrlaerrqqr 

LREVQAKHKHLC^IiAETQGRLMLEPGRWLEQFEVDPELEPESA 
E YliAALERATAAlEQCVNLCKAHVMMVTCFDISVAASAAI PGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPIiRHRSRCATPPRGDFCGGTERAlDQASFTTSMEWDTQ 
VVKGSSPLGPAGLGAEEPAAGPQLPSWLQPERCAVFQCAQCHAV 
LADS VHIiAWDLSRSLG AVVFSRVTNNWLEAPFliVGI EGSLKGS 
TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELKEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI IGSNVLALAEAQRQAEALGYQA 
VVLSAAKC<3DVKSMAQFYGLLAHVARTRLTPSMAGASVEEDA0L 
HELAAELQ I PDLQLEEAiETMAWGRGPVCLliAGGEPTVQLQGSG 
RGGRNQELALRVGAELRRWPLGPIDVLFLSGGTDGQDG PTEAAG 
AWVTPELASQAAAEGLDIATFLAHNDSHTFFCXrLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GD YVERYDP KTDTWTMGAPLSMP TWAVGGCLLGDRL YADGG YDG 
QTYLOTMESYDPOTNEVrTQMASIiNIGRAGACVVVIKQP 


6943 




739 


PMATGDGAKTLAI HVKAIjTADS I R I TWKATLPAS S FRLS WLRLG 
HS PAGGS ITETLVQGDKTE YLLTALE PKPTY I ICMVTMETTNAY 
VADETP VCAKAETADS YGPTTTLNQEQNAGPMASLPIiAGI IGGA 
VALVFLFLVLGAI CWYVHQAGELLTRERAYNRGSRKKDDYMESG 
TKKDNS ILE IRGPGLQMLP INPYRAXEEYWHTI FPSKGSSLCK 
ATHTIGYGTTRGYRDGGIPDIDYSYT 


6944 


960 


156 


VANIliLNGVKYES ELTGS S ERAEQ PLS VGRLCST I CNM P KALRT 
LCVNHFLGWI^FEGMIiLFYTDFMGEVVFQGDPKAPHTSEAYOKY 
NSG VTMGCWGM CI YAFSAAFYSA I LE KLEEFLS VRTLYFI AYLA 
FGLGTGLATI*SRNLYVVLSLCITYGIIiFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVDI SLLSCQ YFLAQI LVSLVLGPLTSA 
VGSANGVM Y FS SLVSFLGCIiYSSLFVI YE I P PS DAADEEHRPIiL 
LNV 


6945 


2067 


179 


EGEDRGLPRTMGAALGTGTRIAPWPGRACGALPRWTPTAPAQGC 
HSKPGPARPVPLKKRGYDVTRNPHLNKGMAFTLEERLQLGIHGL 
I-PPCFLSQDVQLLR IMRYYERQQSDLDKYI I LMTLQDRNEKLFY 
RVLTS D VE KFM P I V YT P T VGLACQH YGLTFRR PRG LF I T I HDKG 
HIATMLNSWPEDNIKAVWTDGERILGLGDIJGCYGMGI PVGKLA 
LYTACGG VNPQQCLP VLLDVGTNNEELLRDPL Y I GLKHQRVHGK 
A YDDLLDEFMQAVTDKFGI NCLI QFEDFANAKAFRLLNK YRNK Y 
CM FNDDIQGTASVAVAG ILAALRITKNKLSNHVFGFQGAGEAAM 
G\lAHLLVMALE\KEGVPKA\EATRKIW\MVDF\KGriIVQGRDH 
LNHEKEMFAQD\HPEVWSLEEWRLVKPTAIIGVAAIAEA\FTE 
QILRDMASFHERP\IIFALSNPTSKAECTA\EKCYRVTEGPRGF 
FAS \ GS PF * G VL I WEMGKTFI PGGRGNNA *R VPRG WQLGVHSPG 
GD PGH I P \ DE I FLPDS RAKLPQEVS EQHLSQGRL Y P\ PLST\ IR 
NVFI*RIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDS Y I WAQGKAMNVQTV 


6946 


" 133 


2551 


SCEYSGITVAPGDPCPGVAHIiLAPSMASDTPESLMALCTDFCIiR 
NLDGTLG YLLDKETLiRLH P D I FLPS E I \ CDRLVNE Y VELVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\liVQD\QD\LE 

airkqdl\vel\yltn\ceklsaksi^u?sfsht'lgvp*affg 
c\tnilllrkenpggl/cedeylfnptcqvlvkdftfegfsrlr 
f\lklgrmidwvpves\llrplnslaaldlsgiqtsdaa\fltq 

WKDSlt\VST»VT.\ VMMni.Qnr»TlTD\ \7TT7/-\T UVT niJT nTfnnnT r*/~. 

YYKFKLTR E VLS LFVQKLGNLMS LD I SG \HM I LENCS I S KI GKR 
EAGQTS I\EPSK\SSIIP FRGFEGG PLQF \LG VF * G I FCGRLTH 
I P AY KVSGDKNEEQVLNAI EA YTEHRPE ITSRAINIjIjFD IAR IE 
RC^QLLRALKLVI TALKCHKYDRNI QVTGSAALF YLTNSE YRS E 
QS VKXRRQ VI Q WLNGMES YQE VTVQRNCCLTL CNFS I PE ELEF 
QYRRVNELLLS ILNPTRQDES IQRIAVHLCNALVCQVDNDHKEA 

vgkmgfv^mijcltqkklldktcdqvmefsw\sai,wnitdetpd 
ncemflnfngmkiifldclnefpekqeiihrnmlgllgnvaevkei, 

RPQLMTSQF I S VFSNLLES KADG IEVS YNACG VLS H IMFDGPEA 
WGVCEPQREEVEERMWAAIQSWDINSRRNINYRSFEP I LRLLPQ 
G I S P VSQHWATWALYNLVS VY PDKYCPLLI KEGGM PIi LRD I 1 KM 
ATARQ ETKEMARKVT EHCSNFKEENMDTSR 


6947 


2 


1682 


TS^STIPRGLASARPQSRSWRCCPVWRRSPGRARGRGLiKMLNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRL IEVTEEELKKHNKKDD CWI C I RGFVYNVS P YME YHPGGE 
DELMRAAGSDGTELFDOVHRWVNYESMLKECLVGRMAI KPAVLK 
DYREEEKKVmGMbPKSQVTDTLAKEGPSYPSYDWFQTDSLVTI 
/EHIY*TEGYQFRLNNS*SSE*FLYSRNNY*GLLISYTYW/R*A 
MRFRKI FLCGI*/ CESVGKIEI VLQKKENTSWDFLGHPLKNHNSI* 

iprkdtglyyrkcqliskedvthdtrlfclmlppsthlqvpigq 

HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 
IYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQEIiEDLFLLA 
AGTGFTPMVKIIiNYAIiTDI PSIiRKVKLMFFNKTEDDI I WRSQLE 
KLAFKDKRLDVEFVLS AP I S EWNGKOGH I S PALLS E FLKRNLDK 
SKVL VC I CGP VPFTEQG VRLLHDLN FS KNE I HS FTA 


6948 


104 


58 


PDGAHSFFPDEYFTCSSLCLSCGVGCKKSMNHGKEGVPHEAKSR 
CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWMGLAKY 
AWSGYVIECPNCGVVYRSRQYWFGNQDPVDTWRTEIVHVWPGT 
DGFLKDNNNAAQRLLDGMNFMAQSVS ELSLGPTKAVTSWLTDQI 
APAYWRPNSQILSCNKCATSFKDNDTKHHCRACGEGFCDSCSSK 
TRPVPERGWGPAPVRVCDNCYEAR/TRPVSCYRGTSGR*RRRRT 
QETVE 


6949 


152 


4656 


GLRLCLSRPLTRPGDDSVGGSAMASGAGGVGGGGGGKI RTRRCH 
QGPIKPYQO^RQQHQGII^RVTESVKNIVPGWLQRYFNKNEDVC 


6950 






S CS TDTS E V PRWP EN KEDHL VYADEESSNITDGR I TPEPA VSNT 
EEPSTTSTAST\YPDVI>TRVSLYRSHLNFSMLESPALHCQPSTS 
SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGPSSRASDKDIT 
VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 
SLGNSS ILKTSQLGDS PFYPGKTT YGGAAAAVRQSKLRNTP YQA 
PVRRQMKAKQLSAQSYGVTSSTARRILQSLEKMSSPLADAKRIP 
S I VS S PLNS PIoDRSG I D I TDFQAKRE K VDSQYP P VQRLMTPKPV 
S IATNRSVYFKPS LTPSGEFRKTNQRI DKKCSTGYEKNMTPGQN 
REQRESG FS Y PNFS LPAANG LSSGVGGGGGKMRRERHAFVAS KP 
LEEEEMEGPVLPKISI>PITSSSLPTFNFSSPEITTSSPSPINSS 
QALTNKVQMTS PS S TGS PM FKFSS P I VKS TEANVLPPS S I GFTF 
S VP VAKTAELSGS S STLEP 1 1 SSS AHHVTTVNS TNCKKT P PEDC 
EGPFRPAEILKEGSVLDILKSPGPAS PKIDSVAAQPTATSPWY 
TRPAISSFSSSGIGFGESLKAGSSWQCDTCLLQNKVTDNKCIAC 
QAAKLSPRDTAKQTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 
WDCDTCLVQNKPEAI KCVACET PKPGTCVKRALTLTvVs ESAET 
MTAS SSS CTVTTGTLG FGD KFKR PIG S WECSVCCVSNNAEDNKC 
VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 
ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 
SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 
MSEGF*FSIQ^IVGFKFGVSSESKPEPVFfTcnc;w>rnMT?Trcv-»T oe-^t 

SNPVFLTPFQFGVSNLGQEEKKEELLKSSCAGFRFGTGVINSTR 
VPANTIVTSENKSSFNLGTIETKSVSVAPLKCQTSEAKKEEMPA 
TKGGFSFGNVEPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 
KLTMKE PKC\ QP VF S FGEFQRQTKDENS S KSTFS FS MTKPS E KE 

SEQPAKATFAFGAQTNTTADQGAAKPDLSYLNNSSSSSSTPATS 
AGGG \ I FGSSTSSSNPPVATFVFGQSSNPGSSS \AFGNTAESST 
SQS LLPS QDSKLATTS STGTAVTPFVFGPGASSNNTTTSGFG FG 
ATTTS S S AGSS FVFGTGPS AP S ASPAFGANQTPTFGQS QGAS Q P 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
SAFGSGTTPNSSSAFQFGSSTTNFNFTONSPSGVFTFGANSSTP 
AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTS FSGRK I K 
TAVRRRK 


6951 


2585 


411 


PRPGSRSG JjCRRAGERGAVRAGGLSRRTRAE * IMDELHYQDTDS 
DVPEQRDSKCKVKWTHEEDEQLRALVRQFGQQDWKFIASHFPNR 
TDQQCQ YR WLRVLNPDLVKGP WTKEEDQKVI EIiVKKYGTKQWTI* 
IAKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRI ICEAHXV 
LGNRWAEIAKMLPGRTDNAVKNHWNSTIKRKVDTGGFLSESKDC 
KPPVYLLLEbEDKDGLQSAQPTEGQGSLLTNT^PSVPPTIKEEEN 
SEEErAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 
ETStiP YKWWEAANLIj I PAVGSSI>SEAI»DL I ESDPDAWCDLSKF 
DLPEEPSAEDSINNSLVQLQASHQQQVLPPRQPSA\LVPSVTBY 
RliDGHT I S DLSRS S RGEI*I P I S PS TE VGGSG IGTPPS VLKRQRK 
RR VALSPVTENSTSLS FLDSCNS LTP KS TPVKTLPFSPSQFLNF 
™ ^V**- 7 x i^-^fco u I i> TP VCSQKVVVTTPLHRDKTPLHQKHAAF 
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPIiPQTPHLEEDLKE 
VLRSEAGIELI XEDD I RPEKQKRKPGLRRS P I KKVRKSLALDI V 
DEDMKLMMSTLPKSLSLPTTAPSNSSSIiTLSGIKEDNSLLNQGF 
LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRJDQLFMOE 
KARQLLGRLKPSHTSRTLILS 




1940 


239 



AGPDDTMKRS IjQAL YCQLIiSFXiL I liALTEALAFA IQE PS PRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ*PPPILKAP/SSTGPAPAAMAT 
rs S KPEGRPRGQAAPT I LLTKP PGATS RPTTAP PRTTTRRPPRP 

PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
LGKIFQIYKGNFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
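
The legend above defines a one-letter residue code plus three annotation symbols (`*`, `/`, `\`). As a minimal sketch of how a segment from the table might be read against that legend, the following counts residues and annotation marks and reports any character the legend does not define. The function name `summarize_segment` and the example string are hypothetical and are not taken from the table.

```python
# One-letter residue codes listed in the table legend (X = Unknown).
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")

# Annotation symbols from the legend, mapped to short labels.
MARKS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment):
    """Count residues and annotation marks in one amino acid segment.

    Whitespace is ignored. Characters that the legend does not define
    are returned separately rather than being guessed at.
    """
    counts = {"residues": 0, "stop": 0, "deletion": 0, "insertion": 0}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKS:
            counts[MARKS[ch]] += 1
        else:
            unrecognized.append(ch)
    return counts, unrecognized


# Hypothetical segment, for illustration only.
example = "MKT\\AP/LV*"
counts, unrecognized = summarize_segment(example)
print(counts, unrecognized)
```

Reporting undefined characters instead of silently dropping them keeps scanning errors visible, so a damaged line is flagged rather than miscounted.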








TVPSNTSWAPTTTSLGPAKDKPGLRRAAQGGGSTFTSQGGTPDA 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LI*AYCYP\CT 
SRPLSTSSGVFTAATGPTPAAFDTSVSAPSQGIPQGASTTPQAP 
THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPQWYPQPQAIS 
STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 
WPSACPSPP\LCPADGVLHEEEEEDRQPGEQPEAYGNNTHHPGT 
TFQQAC\RGAAPGEI PVPLKPLRTQLSEPRS PANGDYRDTGMVP 
C 


6952 


658 


304


PESEGESGEMTDRYTIHSQJjBHLQSKYIGT\ATPTPPSGSG\CE 
PTPRLVLLLHGPLRPSQLLRHCGE * EQSAS PLQLDGKDAS ALWT 
ASRQARGELRLCLTTAVRGTSPSVSPVCQSS 


6953 


1512 


349 


NWGKTRAliASGKH VP FGKQTNPNKS / VHCDS *G* *RRETTQDES 
PS PHFRGKMGGW\ KliEKELENTEQPVGGNEG * EHEVTGWLNSD 

pllelcqcplcqldcgsreqliahvyqhtaawsaksymXcpvc 

UiVniMOXrUO JJVKnJLMJ J. tl£? CtUUKoiM VLAjAK Jt t oHA 1 FNSEKLP 

evlnmesi>ptvhnegpssaegkdiafsppvypagillvcnncaa 
yrklleaqtpsvrkwalrrqneplevrlqrlerertakksrrdu 
etp e erevrrmrdre akrlqrmqe tdeqrarrlqrdreamrl.kr 
aietpekrqapxirereakrlkrrlekmdmmlraqfgqdpsama 
alaaemnffqlpvsgveldsqllgkmafeeqnssslh 


6954 


819 


1 1 


pp p p f 1 1 pshpreagt * ag * krsgds ecs p p veq * a* traaaqn 

* pqr * rwtegns pqasavatpgqgas paaprcrp* psrrhrrlp 
pgarppag*aapaptkpwlagpasapqpgaaplsppapplirtr 

* cagaaargr pr rdrs pr prtpggcs wseprt p pavs asaqtps 

u ™ >*w»k w i^y.tcyKfc'i5 itjR* PPGVGGAGRSHRREGTIPGNPHPR 
AS*RAGWQR*PGP/REWGL+EPOGEEMSGPGGPGGAPPNQVGSS 
VMQAMSTGI 


6955 


1968 


782 


PPGRRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRTMPFLGQD* ' 
WRSPGWSWIKTEDGWKRCESCSQKI^ERENNHCNISHSIIIJISED 
GE I FNNEEHE YAS KKRKKDHFRNDTNTQS FYREKWI YVHKESTK 
ERHGYCTIiGEAFNRLDFSSAIQDIRRFNYWKLLQLIAKSQLTS 
LSGVAQKN YFWI LI>KZ VQKVI»DDHHNPRI*I KDI*LQDIiSSTLCI L 
/N*RSREVCISGKHQYLDLPIRNYSRIATTATGSSDD*ASE\NG 
LTLSDLPLiHMLNN I L YR FSDGWDI I TLGQVTPTL YMliSEDRQIiW 
KKLCQYHFAEKQFCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 
GDTIiHFCRHCS IL F WKDSGHPCTAADPDS CFTPVSPQHFI DLFK 
F 


6956 


8605 


3839 


QTSTS I FAS PTS P P VLGES VLQDNS FDI>NNGSDAEQEEMETQS S " ' 

DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 

PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 

TS PKAS P VTS PAAAFPTAS PANKD VS S FLETTADVEE ITGEGLT 

ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 

YYGPCGKRMKQFPEVIKYIiSRNWHSVRREHFSFSPRMPVGDFF 

EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 

PKVKRGRGRPPKVKI TELLNKTDNR PLKKLEAQETLNEEDKAKI 

AKSKKKMRQKVQRGECQTTIQGQARNKRKQETKSLKQKEAKKKS 

KAEKEKGKTKQEKLKEKVKREKKEKVKMKEKEEVTKAKPACKAD 

KTLATQRRLEERQRQQMILEEMKKPTEDMCLTDHQPLPDFSRVP 

GLTIiPSGAFSDCIiTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 

LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLiKILGEKVSEI 

PLTRDNVSEIIiRCFLMAYGVEPALCDRLRTQPFQAQPPQQKAAV 

LAFbVHELNGSTLI INEIDKTLESMSS YRKNKWI VEGRLRRLKT 

VLAKRTGRS EVEMEGPEECLGRRRSSR IMEVTSGMEEEEEEES I 

AAVPGRRGRRDGEVDATAS S I PELERQ IEKLS KRQL FFRKKItLH 

SSQMLRAVSLGQDR YRRR YWVLP YLAGI FVEGTEGNLVPE E V IK 

KETDSLKVAAHASLNPALFSMKMELAGSNTTASS PARARGRPRK 
TKPGSMQ PRHLKS P VRGQDS EQ PQAQLQ P EAQLHA P AQPQPQLQ 
LQLQS HKG FLEQEG S P LSLGQS QH DLSQSAFLSWLS QTQSHS S L 
LSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSPQQIASSKPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 
FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALHPR 
G I REKALHKHLNKHRD FLQEVCLRPS ADP I FEPRQLPAFQEG IM 
SWSPKEKTYETDLAVLQWVEELEQRVIMSDLQIRGWTCPSPDST 
REDLAYCEHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL 
AALEQRVTBRRYLREPLWPTHEVVLEKALLSTPNGAPEGTTTEIS 
YEITPRIRVWRQTLERCRSAAQVCLCLGQLERSIAWEKSVNKVT 
CLVCRKGDNDEFLLLCDGCDRGCHIYCHRPKMEAVPEGDWFCTV 
CLAQQVEGEFTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVLLR 
GRES PAAG PR YS E EGLS PS KRRRLSMRNHHS DLTFCE 1 1 LMEME 
SHDAAWPFLEPVNPRLVSGYRRIIKNPMDFSTMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 
YQGKQGQ S VRQGRWGVTLWHLP PT FQTKTCHFHLLML P WVQTQV 
RYNPDF 


6957 
6958


82 


3514


HLIVAMPEPTKKEENEVPAPAPPPEEPSKEKEAGTTPAKDWTLV 
ET P PGEEQAKQNANSQLS I LF I EKPQGGTVKVGEDI TFI AKVKA 
EDLS E KPTI NGS R KWMDLASKAGKHLQLKETFERHS RVYTFEMQ 
IIKAKDNFAGNYRCEVTYKDKFDSCSFDLEVHESTGTTPNIDIR 
SAFKRSGEGQEDAGELDFSGLLKRREVKQOEEEPOVDVWELLKN 
TKPSEYEKIAFOYESPTCSGMLKRLKRSIREEKKSAAFAKILDP 
VY QVDKGGRVRF WELAD P KLEVKWN KNGQELRP STKY I FEDTR 
CQSILNIDNCQMTDDSEYYVTAGDEKCSTELLVREPPIMVTKQIi 
EDTTDYCGERVELECEVSEDDAQVKWFKNGEEIILVQTRYRIRV 
EGKKH I LI I EGATKADAAD YS VMTTGGQSS AKLS VDLKPLKI LT 
PLTDQTVNLGKEI CLKCE I SENI PGKWTKNGLPVQESDRLKWH 
KGRIHKIiVIDHALTEDEGDYVFAPDAYNVTLPAKVHVIDPPKI I 
LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDnsnVYHTMT.irKrc , 2ir , T??vijnoTv 

VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 
RKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYEVRIFAVNA\I 
GISKPSMPSRPFVPIAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGLDG YVLE YCFEGS TSAKQSDENGEAAYDLPAEDW I VANKD 
LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPILVKE 
IIEPPKIHSPKHLKQTYIRRVGDRVILVTPFQGKPRPELTWKKD 
GAE IDKNQINIRNSETDTI I FIRKAERSHSGKYDLQVXVDKFVE 
TAS I D I R I IDRPG P PQ I VK I ED VWGRNVALTWTPPKDDGNAAI T 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 
MCGLSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 
VNRLCHSGYWIATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLE I RKPS P YDGGTYCCKAVNDLGTVEI ECKLEVKV IAQ 


6959 


274 
1 


lOOJ 

1469


PRTSRVKTEGSQGSSAMDFS VKVDIEKEVTCP I CLELLTE PLSL 
DCGHSFCQACITAKIKESVIISRGESSCPVCXJTRFQPGNLRPNR 
HLANI VERVKE VKMS PQEGQ KRD VCEHHG KKLQI FCKEDGXVT C 
WVCELSQEHQGHQT FRINE WKECQEKLQVALQRL I KENQEAEK 
LEDDIRQERTAWKNYIQIERQKILKGFNEMRVILDNEEQRELQK 
LEEGE VNVLDNLAAATDQLVQQRQDAS TLI S DLQRRLRGS S VEM 
LQDVIDVMKRSESWTLKKPKSVSKKLKSVFRVPDLSGMLQVLKE 
LTDVQ Y YW VDVMLN PGSATSNVAI S VDQRQVKT VRTCT FKNSNP 
CD FSAFG VEGCQ YFS SGKYYWE VD VSGKI AWILGVHS KI SS LNK 
RKSSGFAFDPSVKYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTLFMAV\LPWLGFS 

SL VHWEFGRGI EDFP YLFFQLTHCQQR I CS VTQAG VQWCDHSS 
LQPQTPGLNQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPPNVT 
WTBLEDRDGRVYPHPQDLLAAIiPLALVIjIiAMRLAFERFIGLPIiS 
RWLGVRDQTRRQVKPNATLEKHFbTEGHRPKEPQLSLLAAQCGL 
TLQQTQRWFRRRRNQDRPQI/TKKFCEASWRFLFYLSSFVGGLSV 
LYHESWLVJAPVMCWBRYPNQI*T1»SCPAADSEA\SLYWWYIiLELG 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVIIjMTFSYSANIiLRlGSLVIjLLHjDSSDYLIiEACKMVNYMQ 
YQQV CDALFL I FSFVFFYTRI1VLFPTQI LYTTYYES I SNRGPFF 
GYYFFNGIiliMLIiQLLHVFWSCLILRMriYSFMKKGQMEKDIRSDV 
EESD S SEE AAAAQE PLQLKNGTAGG PR PAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWARE KEMQE F \TRS FF\RGRPDLSTLTHS I VRRRY LAHSGRS 
HLEPEEKQAl/KRLVEEEPLKMQVDEAASREDKLDIiTKKGKRPPT 
PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTHP 
KEENPRRA\ SKAVEES SDEERQRDIiPAQRGEESSEEEEKGYKGK 
TRKKPWKKQAPGKASVSRKQAREESEESEAEPVQRTAKKVEGN. 
KGTKSIiKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK 
PRTRSNGRRKSAREERSCKQKSQAKRLIX5DSDSEEEQKEAASSG 
DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGS RKMARLG STSGEES DLERE VS DS EAGGG PQGERKNRSS KKS 
SRKGRTRS SS SS SDGS PEAKGGKAGSGRRG EDHPAVMRLKRY IR 
ACGAIIRNYKKLLGS CCSHKERLS ILRAELEAIXBMKGTPSLGKCR 
ALKEQREEAAEVASLDVANI I SGSGRPRRRTAWNPLGEAAPPGE 
LYRRTLDSDEERPRPAPPDWSHMRGIISSDGESN 


6961 


340 


1646 


R P WS S PTMK PN FS L»RLR I FNLNCW G I P Y I*S KHRADRMRRLGDFL 
KQES FDLALLEEVWSEQDFQYLROKLS PTYPAAHHFRSGI IGSG 
LCVFS KHPI QELTQHI YTLNGYP YMIHHGDWFSGKAVGLLVLHI, 
SGMVLNAYVTHLHAEYNRQKDIY1AHRVAQAWELAQFIHHTSKK 
ADWIiLCGDLNMHPEDLGCCLIiKEWTGIjHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPIjSDHEAIjMATLFVRHSPPQQNPSSTHGPXAERS 

pl/mcvci>kealdgsiigiigma\qarwwa\tfa\syviglgl\r»l 
iallcvlaagggageaaillwtpsvglvlwagafylfhvqevng 
lyraqaelqhvlgrareaqdlgpe pqlyaliAlgqqegdrtkeq 


6962 


340 


1646 


RPWSSPTMKPNFSLRLR I FNLNCWGI PYLS KHRADRMRRLGDFL 
NQES FDLALLEEVWSEQDFQ YLRQKLS PTYPAAHHFRSGI IGSG 
LCVFSKHPIQEIiTQHIYTLNGYPYMtHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYKRQKDI YIjAHRVAQAVIEIiAQFI HHTS KK 

advvllcgdltmhpedlgcciilikewtglhdayletrdfkgseeg 
ntmvpkwcyvsqqelkpfpfgvridyvlykavsgfyiscksfet 
ttgfdphrgtplsdhealmatlfvrhs p pqqn ps sthgp \ aers 
pl/mcvciikealdgslglgma\qarwwa\tfa\syviglgl\ltj 

LALLCVIAAGGGAGEAAILLWTPSVGLVLWAGAFYIjFHVQEVKG 


6963 


374 


2618 


RVTPL ILKLIjKKPKTAENQKASEENE itqpggssakpglpclwf 

eaviis pdpalihsthsiitnshahtgs sdcdis ckgmterihs in 
lhnfsnsvleti^eqpj^rghfcdvtvt^ihgsmlraqrcvlaags 
pffqdklllgysdieipswsvqsvqklidfmysgvlrvsqsea 
lqiltaasilqiktvidectrivsqnvgdvfpgiqdsgqdtprg 
tpesgtsgqssdtesgylqshpqhsvdriysalyacsmqngsge 
rsfysgawshhetalglprdhhmedpswitrihersqqmeryl 
s ttpetthcrkqprpvriqtltvgnihi kqemeddyd y ygqqrvq 
ilerneseectbdtdqaegtesepkgesfdsgvss s igtepds V 
eqqfgpgaardsqaeptqpeqaaeapaeggpqtnqletgasspe 
r s ne vemds tv i tvsnssdks vlqq ps vnts i gq p l p s tql ylr 
qtetltsnlrmpltltsntqvigtagntylpalfttqpagsgpk 
PFLFSLPQPLAGQQTQFVTVSQPGIjSTFTAQLPAPQPLASSAGH 
STASGQGEKKPYECTLCNKTFTAKQNYVKHMFVHTGEKPHQCSI 
CWRS FSLKD YL I K\ HMVTHTGVRA YQCS I CNKR FTQKS S LNVHM 
RLHRGEKS Y EC Y I C KKKFS HKTLLER HVALHS ASNGTP PAGT P P 
GARAG PPG WACTEGTTY VCS VCPAKFDQ I EQFNDHMRMHVS DG 


6964 


1 


178


SGRPFFFFFSNTDVYFIKKVTNRWTAGSSYKMTRMKSIGKIIiLL 
QI FIG\NCSMFVIiVI 


6965 


757 


208 


SGSLGYNLPQNH\GLLGRNTLVLLGQMRRISPFLCLKDRSDFRF 
PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKAIiL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWVIEGSTLALRRY 
FQESISTLE 


6966 


820 


1867 


IITAIiGVRGMPGCPCPGCGMAGPRLLFLTALALELLGRAGGSQP
ALRS RGTATACRLDNKES ES WGALLS GERLDTW I CS LLGS LMVG 
LSGVFPLLVI PLEMGTMLRSEAGAWRIiKQLLS FALjGGLLGNVFI* 
HLLPEAWAYTCS AS PGGEGQSLQQQQQLGLWVI AGI LTFLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 
GP\DIKVSGYLNLLANTIDNFTHGLAVAASFLVSKKIGLLTTMA 
I LLHEI PHE VGDFAI LLRAG FDRWS AAKLQLS TALGGLLG AG FA 
ICTQS PKGVEETAAWVLP FTSGGFLYIALVNVbPDLLEEEDPW 


6967 


162 


633 


G F1>P FKYWI LDLS AS SRM ETDCNPMELS SMSGFEEGSEIiNG FEG 
TDMKDMRIjEAEAVVNDVLFAVNI^FVSKSLRCADDVAYINVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLO/TPYHETVYSLIjDTL\ 
SPAYREAFGKR \LLQRLEALKRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT 
LEQFHLSSMSSLGGPAAFSARWAQEAYKKESAKEAGAAAVPAPV 
PAATEPPPVLHLPAIQPPPPVLPGPFFMPSDRSTERCETVLEGE 
TISCFWGGEKRLCLPQI LNS VLRDFSLC3QINAVCDELH I YCSR 
CTADQLEI LKVMGI LPFSAPSCX3LITKTDAERLCNALLYGGAYP 

S PSAACXQCLD \ CRLM YP PHKFWKSH KAXENRTCHWGF \ DS A\ 
NWRAYI LLSQDYTGKEEQARiGR \ CLDDVKEKFD YGNKYKRRVP 
R VSS E P PAS I R P KTDDTSSQS PAPS EKDKPSS WLRTLAGS S NKS 
LGCVHPRQRLSAFRPWSPAVSASEKELSPHLPALIRDSFYSYKS 
FETAVAPNVALAPPAQQKVVSS PPCAAAVSRAP E PLATCTQPRK 
RKLTVDTPGAP ETLAP VAAP EEDKDSEAEVEVE S REEFTS S I*S S 
IiSSPSFTSSSSAKDLGSPGARAIiPSAVPDAAAPADAPSGLEAEl* 
EHLRQALEGGLDTKEAKE KFLHEWKMRVKQEE KLS AALQAKRS 
LHQELE FLRVAKKE KLREATEAKRNLRKE IERLRAENEKKMKEA 
NESRLRLKRELEQARQARVCDKGCEAGRIiRAKYSAQIEDLQVKL 
QHAEADREQLRADIjIjREREAREHLEK\VVK\ELQEQLWPRARPE 
AAGSEG\AAELEP 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRIiEREQKLKLYQSATQAVFQKRQA 
GELDESVLELTSQILGANPDFATLWNCRREVLQQLETQKSPEEIj 
AALVKAELGFLES CLR VNP KS YGTWHHRCWLLGRLP E PNV7TREL 
ELCARFLEVDERNFHCWDYRRFVATQAAVPPAEELAFTDSLITR 
NFSNYSS WHYRS CLLPQLH PQPDSG PQGRLPED VLLKELEliVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSF 
SR PLLVGSRME I LLLMVDDS PLI VE WRTPDGRNRPSHVWLCDLP 
AAS LNDQLPQHT FRVI WTAGD VQKE CVLIiKGRQEGWCRDS TTDE 
QLFRCEI^VEKSTVI^SELESCKELQELEPENKWCLNLTIILLM 
RALDPLLYEKETLQ YFQTLK\ AWD P KRATY \ LDDLRSKFLLENS 
VLKMEYAEVRVLHI^KDLTVI^HLEQLLLVTHLDLSHNRLRTL 
P PALAAIjRCLEDPP PRT\ VLQASDNAI ESLDG VTNLPRLQEIjLI* 
CNNRLOQPAVLQPIiAS CPRIiVIjLNLQGNPLCQ AVG I LEQLAELL 
PSVSSVLT 


6970 


3 


1528 


S FPPLLSS PSAVGEGKVAVAAPCPGRSECARAXMA Y I QLE PLNE 
GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEDEVEILGPFPA 
QTP PWLMASRSS DKDGDSVHTASEVPLTPRTNS PDGRRSSSDTS 
KS T YS LTRR I S SLE SRRPSS PL I D I KP I E FGVLS AKKEP I QPS V 
LRRTYNPDDYFRKFEPHLYSLDSNSDDVDSLTDEEILSKYQLGM 
LIIFSTQYDLLHNHLTVRVIEARDLPPPISHDGSRQDMAHSNPYV 
KI CLLPDQKNSKQTGVKRKTQKPVFEERYTFE I P FLEAQRRTLL 
LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 
VELGELLLS LNYLPSAGRLNVDVI RAKQLLQTD VSQGSD P FVKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
FTVFGHNMKS SNDFIGR I VIG\QYS SGP \ SE PNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVSPASLEVT 


6971 


37 


3702 


ACF YVPGSRS FKL*I PRHGLVNMGRSGKLPSG VSAKLKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 
RLGKS E AP ET PMEEEAELVLTEKSSGTFLSGLS DCTNVT FS KVQ 
RFWESNSAAHKE I CAVLAAVTEVI RSQGGKETETEYFAALIRKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTLHMLTLLKDLLPCFPEGLVKSCSETLLRVMTLSHVLVTA 
CAMQAFHSLFHARPGLSTLS AELNAQ 1 1 TALYD YVPSENDLQPL 
IAWLKVMEKAHINIiVRLQWDLGLGHLPRFFGTAVTCLLSPHSQV 
LTAATQSLKEILKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 
EEGLTYKFHAAWSSVLQLLCVFFEACGRQAHPVMRKCIiQSLCDL 
RLS PHFPHTAALDQAVGAAVTSMGPE VVLQAVPLE I DGSE ETLD 
FPRSWLLPVIRDHVQETRLGFFTTYF1 J PLANT3 J KSKAMDLAQAG 
STVE S K I YDTLQ WQMWTLLPGFCTRPTDVAI S FKGLARTLGMAI 
SERPDLRVTVCQALRTLITKGGQAEADRAEVSRFAKNFLP I LFN 
LYGQPVAAGDTPAPRRAVLETIRTYLTITDTQLVNSLLEKASEK 
VLDPASSDFTRLSVLDLWAIiAPCADEAAISKLYSTIRPYLESK 
AHG VQKKAYRVLEE VCAS PQGPGAIj F VQSHLEDL KKTLLDS LiRS 
rSSPAKRPRLKCLLHIVRKLSAEHKEFXTALIPEVILCTKEVSV 
GARKNAFAIiliVEMGHAFLRFGSNQEEALQCYIiVIj I YPGLVGAVT 
MVSCS ILALTHLbFEFKGLMGTSTVEQLLENVCLLIiASRTRDW 
KSALGFI KVAVTVMDVAH1AKHVQLVMEAIGKLSDDMRRHFRMK 
LRNLFT\ £CFIPK\ FGILTWGKKAVGPKEYHRVLVNIRKAEARAK 
RHRALS QAAVEEEEEEE EE EEPAQGKGDS I EE I LADS EDEE DNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 
TQPGPGRGRKKDHSFKVSADGRLIIREEADGNKMEEEEGAKGED 
EEMADPMEDVI IRNKKHQKLKHQKEAEEEELEI P PQYQAGGSGI 
HRPVAKKAMPGAE YKAKKAKGDVKKKGRPDPYAY I PLNRSKLNR 
RKKMKLQGQFKGI>VKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAI LLPLWRRTR P REATVPRGAAQRGRARSAEGR IPSSQSPS 
PAEAGGATRSPPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAATERARRGATMGAQLSTLGHMVLF P VWFLYS LLMKLFQRS TP 

QHI YLSARIDGNLWRPYTP I SSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPI IRTVKSVGM I AGGTG ITPMLQV I RAIM KDPDDHTVCHLIj F 
ANQTEKD I LLRPELEELRNKHSARFKL W YTLDRAP EA WDYGQG \ 
FVNEEM I RDHLP P PE \EEPLVLMCGP P PMIQ YACLPNL \DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRC^U^ RGLRAQKCGRPAPGVDAMVI.C PVI GKLLHKR WLAS A 
S PRRQE I LSNAGLRFE WPS KFKEKLDKAS FATP YG YAMETAKQ 
KALEVANRLYQKDLRAPDW I GADTI VTVGGL ILEKP VDKQDAY 
RMLSRFE / SGREHS VFTGVA I VH CS S KDHQLDTRVS E FYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 
SDVEGGGSEPTQRDAGSRDEKAEAGEAGQATAEAECHRTRETLP 
PFPTRLLELIEGFMLSKGLLTACKLKVFDLLKDEAPQKAADIAS 
KVDASACGMERLLD I CAAMGLLEKTEQGYSNTETANVYLASDGE 
uo r - Ll iril NiMjji_»i wjMiji* i Xi*t.l?AiKEGTNQHHRAliGKKAEDI>F 
QDAYYQSPETRLRFMRAMHGMTKLTACOVATAFNLSRFSSACDV 
GGCTGAliAR ELARE Y PRMQVTVFDLPD 1 1 ELAAHFQPPGPQAVQ 
IHFAAGDFFRDPLPSAELYVLCRIbHDWPDDKVHKLLSRVAESC 
i\.ir \jnyijuLi v c j. ijijjjciiP^K VAyK>UjMQSi,ijpijjVOTEGKERSIjGEY 
QCLLELKGFHOVQWHLGGVLDAIL\PPKWPPEAQAACSL 


6974 


3082 


2172 


RSCAAFAS FASRP PLELFAPPGSHRSPPGRGVATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSLLPKNISIESREEEITSPQSNWEGTNTDPSPSGFSSTSGGVH 
±ji i iiibLMbiibi ^i^AGVAATLSQSAAEPPTLISPOAPASSPSSL 
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTS HATAE P V PQE KTP PTT VSG KVMCEL I DM ET\ P P P FPG 


6975 


2 


500 


RPRPTraCCKWALKLETAMETLIWFHAHSGKEGDKYKI.SKKEi, 
KELL.QTELSGFLDVKELML* ATEALKTFEEA+ KSPI IQCSSSRS 
SLP PAPQPPP Yk* LSAVPFP IHLPLPLLPPQAQKDVDAVDKVMK 
ELDENGDGEVDFQE Y WIjVAALTVACNNFFWENS 


6976 
6977


1216 


970 


GCQIj*VAYGTTENSPVTFAHFPEDTVEQKAESVGRIMPHTEARI 
MNMEAGTJjAKIJ^PGELCIRGYCVMLiGYWGEPQKTEEAVDQDKl^ 
YWTGD VATMNEQG FCK I VGRSKDMI I RGGENI YPAELEDFFHTH 
PKVQEVQWGVKDDRMGEEI CACI RLKDGEETTVEEI KAFCKGK 

ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL+rKQQ 
ACPGRLA 




1298 


588 


SLFINTNLLSNQIRKTSFGMCSEPISDNTEDQKGKLKTPDFA*R

ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 

VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEWRWYQPEHMSFE 

ELIiKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 

KENYQKVLS EHGFG P I TTD I REGQTF YYAEDYHQQYLS KNPNG Y 

CGJLGGTGVSCPVGIKK 


6978 


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKDSKEAKQRIjQQLFKGSQ 
FAIRWGF I PLVI YLGFKRGADPGMPEPTVLSLliWG 


6979 
6980 


3917 
1 


1146 
420 


dearvrgeavaaailsrcrhwsgpppfppsppdrkglrgtepwe 
agpgsgatpgaramdvrrlkvnelreelqrrgldtrglktelae 
rlqaaleaeepddereldaddepgrpghineeveteggselegt 
aqppppglqphaepggysgpdghyamdnitrqnqfydtqvikqe 
nesgyerrplemeqqqayrpemktemkqgaptsflppeasqlkp 
drqqfqsrkrpyeenrgrgyfehredrrgrspqppaeededdfd 
ixtlvaidtyncdlhfkvardrssgypltiegfaylwsgarasyg 
vrrgrvcfemkineeisvkhlpstepdphwrigwsldscstql 
geepfsygyggtgkkstnsrfenygdkfaendvigcfadfecgn 

DVEItS FTKNGKWMG I AFR I QKEALGGQAL YPHVL V2CNCAVE FNF 

gqraepycsvlpgftfiqhlplserirgtvgpkskaeceilmmv 

GLPAAGKTIWAI KHAASN PS KKYNI LG TNA IMDKMRVP4GIiRRQR 

n yagr wdvli qqatqc^riilqiaarkkrnyiiidqtwvygsaqr 
rkmrpfegfqrkaivicptdedlkdrtikrtdeegkdvpdhavl 
emkanftiipdvgdfldeviifielqreeadklvrqyneegrkagp 
ppekrfdnrggggfrgrgggggfqryenrgppggwrggfqnrgg 
gsggggnyrggfnrsggggysqnrwgnnnrdnnnsnnrgsynra 
pqqqpppqqppppqpppqqpppppsysparnppgastynknsni 
pgssantstptvssysppqsfgffpstfqpsysqppynoggysq 
g ytap p p p p ppp payn ygs yggynpap yt pp p pptaqty pq ps y 
nqyqqyaqqwnqyyqnqgqwppyygnydygsysgntqggtstq 

GTRGRKTGRVAAPSTRRRTGNMQKLQTRSPAMSI.SDPGLGYHPT 
CWTLRW P PLCS LHAIjHVFHCIjFSSRIjGTP VS PR LAMD PMCS CE A 
GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRAS LR PAFAARG VFQGGLGQAKQARTRACAALPTPHP S 
APRLLEPQGVFSLFPPPPGPWPNMIIiTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLS I FSQE YQKHI KRTHAKHHTSEAI 
ES YYQRYI^GWKNGAAPVLLDLANEVDYAPSLMARLI bERFXQ 
EHEETPPS KS I INSMLRDPSQ I PDGVLANQVYQCIVNDCCYGPI* 
VDCI KHAIGHEHEVLLRDLIiLEKNLS FLDEDQLRAKGYDKTPDF 
I LQV PVAVEGH1 1 HW I ESKASFGDECSHHAYLHDQFWS YWNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTNIVTLCHSIA 


6982 


153 


1285 


FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGIiKRVAWLAPP 
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHPVIKAFliCGS I SGTCSTl^FQPI^LLKTRLQTLQ 
PS DHGSRRVGMLAVLLKVVRTESLLGLW KGMS PS I VRC VPG VG I 
YFGTIiYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTR YESGKYG YES I YAALRS I YHSEGHRGLFS GLTATLLRDAP F 
SG I YLMFYNQTKNI VPHDQVDATLI PITNFS CGI FAG ILASLVT 
QPADVI KTHMQLYPLKFQWIGQAVTLIFKDYGLRGFFQGGI PRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


EMSFLQDPS FFTMGMWS IGAGA^AAALALLIaANTDVFliSKPQK 
AALEYLFJ^IDLKTIiEXEPRTFKAKEIjWEKNGAVIMAVRRPGCFL 
CREEAADIiSSLKSMIiDQLGVPLYAWKEHIRTEAnCDFQPYFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYWFFRAWNGGFSGNLE 
GEGFILGGVFVVGSGKQGILLEHREKEFGDKVNLLSVI.EAAKMI 
KPQTIiASEKK 


6984 


1845 


1282 


GGRSAYSLPAGS LPRVPATAAAKMASGVQVADEVCRI FYDMKVR 
KCSTPEE I KKRKKAVI FCLSADKKCI 1 VEEGKE I I»VGDVG VT I T 
DPFKHFVGMLPEKDCRYALYDASFETKESRKEELMFFLWAPBLA 
PLKS KM I YAS S KDAIKKKFQGIKHECQANGPEDLNRACI AEKLG 
GSLIVAFEGCPV 


6985 


1887 


1324


RRTAG I YPCF PKPGRTRHALCS VVXI*liL>TGQLAFDDFQES CAMM 
WQKYAGSRP^MPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 
YKIAVEQI^SHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFIiELKDGQQIPVFK 
LSGENGDEVKKE 


6986 


642 


1350 


YHLYFKMGDPNSRKKQALNRLRAQLRKKKESIiADQFDFKMYIAF 
VFKEKKKKSALFEVSEVI PVMTNNYEENILKGVRDSS YSLESSL 
Ej^QKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIEKIVC 
llfsrwkesdep frpvqakfefhhgdyekqflhvlsrkdktg iv 
VNNPNQSVFLF I DRQHLQTPKNKATI FKLCS I CL YLPQEQLTHW 
AVGTI EDHLRPYMPE 


6987 


1623 


341 


leaaekasrafkesqrqtdsknyetenwspqksqrrydmyntac 

FLGEI EVGLYTIQILQLTPFFHKF^ELSKKHMVQFLSGKWTI PP 

dprnecylalskftshlknlqsdlkrcfdffidymvllkmrytq 
keiaeiml*skkvsrcfrkytelfchldpcllqskesqllqeenc 
rkklealradrfaglle ylnpnykdattmes ivneyafllqqns 
kkpmtne kqns iiiani ilsclkpnskliqplttlkkqlrevlqf 

VGj^HQYPGPYFIiACLLFWPENQELDQDSKLIEKYVSSLNRSFR 
GQYKRMCRS KQASTIjFYLGKRKGLNS I VHKAKIEQYFDKAQNTN 

slwhsgdvwkknevkdllrrltgqaegklisveygteekikipv 
isvysgplrsgrniervsfylgfsiegppgl 


6988 


3 


689 


tqllrrpavfvgsaasgirsglwsassghwcapaagrahapvpr 
lvrglgaastaapqdaqtgpqpmpradcimrhlpyfcrgqwrg 
fgrgskqlgiptanfpeqwdnlpadistgiyygwasvgsgdvh 
kmws igwnpyykntkksmethimhtfkedfygeilnvaivgyl 
RPEKNFDSLESLISAIQGDIEEAKKRLELPEHLKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


LMPSDRPbSPSTHASAGSHCHAPPTTAJRRAFPIPFGSKSNMATL 
KDQLI YNLLKEEQTPQNKITWGVGAVGMACAI S I LMKDLADEL 
ALVD VI E D KL KG EMMDLQHGS L FLRTP K I VS G KD YNVTANS KLV 
IITAGARQQEGESRLNLVQRNVNIFKFIIPNWKYSPNCKLIilV 
SNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV 
H P LS CHG WVhGEHGD S S VP VWS GMNVAG VS L KTLHPDLG TDKD K 
EQWKEVHKQWESAYSVIKLKGYTSWAIGLSVADLAESIMKNLR 
RVHPVSTMI KGLYGI KDDVFLS VPCI LGQNG ISDLVKVTLTSEE 
EARLKKSADTLWG IQKELQF 


6990 


719 


258 


THASGMASWLALRTRTAVTSLLSPTPATALAVRYASKKSGGSS 
KNXiGGKSSGRRQGIKKMEGHYVHAGNI IATQRHFRWHPGAHVGV 
GKNKCLYAbEEGIVRYTKEVYVPHPRNTBAVDLITRLPKGAVLY 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RRSSDFHNPGFLSRPVSLRENIHHQVICSTKNKRRNPKKIAYLL 

SSLLMTNLNPNESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 


6992 


944


510 


RQAPGCSSI±ALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGA1.Y 
RS PMNQEN P P P YPGPG PTAP YP P YP PQPMG PG PMGG P YPP PQG Y 

PYQGYPQYGWQGGPQEPPKTTVYWEDQRRDEIX3PSTCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


QW C VTC PQHNARQGPAVPPGI QA YGAAPFEDI#Q VDFTEMS KCRG 
DR VW I KNWNVASLCPLWKGPQTVVLS P PTAVKVEG I PAW I HHSH 
VKPAARETV7EARPS PDNP FR VTLKKTTS PAP VTPGS 


6994 


346 


1100 


QWPEKDPVMAASSISSPWGKHVFKAILMVLVAIiII^HSAIAQSR 
RDFAPPGQQKREAPVDVLTQIGRSVRGTLDAWIGPETMHIiVSES 
SS QVLWAI S SAI SVAFFAliSG IAAQLLNALGLAGDYLAQGLKLS 
PGQVG/TFLLWGAGALVVYWLLSLLLGLVLALLGRI LWGLKLVI F 
LAG F VALM RS VPDPSTRALLLLALL I LYALL SRLTGSRASGAQIi 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 


6995 


144 


1346 


GS VAVGLSG IMAAQKDLWDAI VI GAG I QG CFTAYHLAKHRKR I Ij ' 
LLEQFFL PHSRGS SHGQSRI IRKAYLEDFYTRMMHECYQ I WAQL 
BHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPN I RLPRGEVGLLDNSGG VI YAYKALRALQDAI RQLG 
G I VRDGEKWE INPGLLVTVKTTSRS YQAKSLVI TAGP WTNQLL 
RPLGIBMPLQTLRIxgVCYWREWVPGSYGVSQAFPCFLWLGLCPH 
HI YGLPTGE Y PGLMKVS YHHGNHAD PEERDCPTARTD I GDVQI L 
SS FVRDHLPDLKPEPAVI ESCMYTNTPDEQFI LDRHPKYDNI VI 

GAGFSGHGFKIAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KAHL 


6996 


543 


1942 


ETANAEAAARKSAMDWKEVLRRRIATPNTCPNKKKSEQELKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTI PRFY YRLPAENEVLLQKLR 
JiJi^KAVirijQKKSKELLDNEELQ^LWFLLDKHQTPPMIGEEAMIN 
YEN FIiKVGE KAG AKCKQF FTA KVFAKLLHTDS YGR I S IMQFFNY 
VMRKVWLHQTR IGLSLYDVAGQGYLRESDIjENYI LELIPTLPQL 
DGLEKS FYS FYVCTAVRKFFFFLDPLRTGKIKlQDILA.es FLDD 
LIiELRDEELSKESQETNWFSAPSALRVYGQYLNLDKDHNGMLSK 
EEIiSRYGTATMTWFLDRVFQECLTYDGEMDYKTYLDFVLALEN 
RKEPAALQYIFKLLDIENKGYLNVFSLWYFFRAIQELMKIHGQD 
P VS FQDVKDE I FDMVKPKD PLKISLQDL I NSNQGDT VTT I L I DL 
NGFWTYENREALVANDSEWSADLDDT 


6997


370 




1104 


AMELTIFILRLAI YILTFPLYLLNFLGLWSWI CKKW FPYFLVRF " 
TVI YNEQMAS KKRELFSNLQE FAGPSGKLSLLE VGCGTGANFKF 
YPPGCRVTCIDPNPNFEKFLI KS I ABNRHLQFERFVVAAGENMH 
QVADGSVDVWCTLVLCSVKNQERILREVCRVLRPGGAFYFMEH 
VAAECSTWNYFWQQVLDPAWHIiliFDGCNLTRESWKAIiERASFSK 
LKLQHIQAPLS WELVRPHI YGYAVK 


6998 


2 


616 


FVSRALLRVRSRRHPAEERAAPGRPEDAPIECPGATNCPEPLWC 
SHLPVPYAPPTMESRGKSAS S PKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKQQP PAAPTTAPAKKTSAKADPALLNNHSN 
LKPAPTVPSSPDATPEPKGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPIiLVAGGVAVAAIAIiILGVAFLVRKK 


6999 


14 


1591 


GRAGACSRRDTAMSIEIESSDVIRLIMQYLKENSLHRALATLQE 
ETTVSLNTVDS I ESFVADINSGHWDTVLQAIQSLKLPDKTLIDL 
YEQWLELIELREIjGAAR S LXiRQTDPMIMlrKQTQPER Y IHI»ENI* 

lars yfdpreaypdgss kekrraai aqalagevs wppsrjlmal 
lgqalkwqqhqgllppgmtidlfrgkaavkdveeekfptqlsrh 
ikfgqkshvecarfspdgqylvtgsvdgfievwwfttgkxrkdii 
kyqaqdnfmmmddavlcmcfsrdtemlatgaodgkikvwkiqsg 
qci>rrferahskgvtclsfskdssq i lsasfdqti rihglksgk 
tlke frghss f vneatftqdgh y 1 1 sassdgtvki wnmkttecs 
ntfks lgs tagtdi tvnsvillpknpehfwcnrsntwimnmq 
gqivrsfssgkreggdfvccalsprgewiycvgedfvlycfstv 
tgklertltvhekdvigiahhphqnliatysedgllklwkp 


7000 


2 


827 


GPGWFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPIiLQPALTGDVEGLQKIFEDPENPHHEQAMQLLLEEDIVGRN 
LL.YAACMAGQSDVI RALAKYGVNLNEKTTRGYTLLHCAAAWGRL, 
ETLKALVEIjDVDI EALNFREERARDVAARYSQTECVEFLDWADA 
RLTIiKKY LAKVSIAVTDTEKGSGKLtiKEDKNTI LSACRAKNEWt, 
ETHTEAS INELFEQRQQLEDIVTPIFTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCIil IAFLKGCFI FIYFIFIFETEFIiSCCPGWSAVAQSRIjIAN 
FASQVQAIFILPKDSQVGPDVKSEAAPKRAIjYESVFGSGE I CX3P 
TSPKRLCIRPSEPVDAVVVVSVKHDPLPLIiPEANGHRSTNSPTI 
VSPAI VS PTQDSR PNMSRPL I TRS PAS PLNNQG I PTPAQI»TKSN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 
HYFI DRDGQMFRY ILNFLRTSKLLIPDDFKDYTLLYEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
SLIEE VFP E IGDVI-lCNSVNAGV5fNHDSTHV I RFPLNG YCHLNSVQ 
VIiERIiQQRGFE X VGS CGGGVDS S QFS E YVIiKKEL»RRTPKVFSVI 
RIKQEPLD 


7002 


1043 


498 


PMPS S TRWTTS * T YTDTS S AWACRPTTGTCT* TAAPGPTVR W WP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRIiSAIiliAIiASKVTIiPPHYRYGMSPP 

GKQGKWQ VI RQRN W VWGGLN THYR Y I GKTMDYRGTMI PSEAP 
LLHRQVKLVDPMDRKPTEIEWRFTEAGERVRVSTRSGRI I PKPE 
FPRAIXSIVPETWIDGPKDTSVEDAI^RTYVPCLKTLQEEVMEAM 
GI KETR\NTRRSIG IEPGABQLLPNFCPSLEG 


7004 


121 


2285 


FLtiPVLTSRSIiRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\ PKRT1>KTQLG/ Y YCRVRP LGFPDQECCI EVIWNTTVQIiHTPE 
GYR1»NRNGD YKETQYS FKQVFGTHTTQKELFDWAN PLVNDL IH 
GKNGLLFTYGVTGSGKTHTMTGSPGEGGLLPRCLDM I FNS IGS F 
QAKRYVFKSNDRNSMDIQCEVDALLERQKREAMPNPKTSS SKRQ 
VDPEFADMITVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLL 
EEVPFDPINPNLHNLNCFVKIKNHNMYYAGCTI^EVKSTEEAFE 
VFWRGQKKRRIANTOLNRESSRSHSVFNIKIjVQAPLDADGDNVIi 
QEKEQ1TISQLSLVDLAGSBRTNRTRAEGNRLREAGNINQSL.MT ' 
I^TCWDVLRENQ^GTNKMVPYRDSKLTHLFKNYFDGEGKVRMI 
VCVNPKAEDYEENLQVMRFAE VTQEVEVARPVDKAI CGLTPGRR 
YRNQPRGPV IGNEPLVTDW1*0<? FPPT.P<?rPTT.nTMnrnTT ddt 
I EALEKRHNLRQMMIDEFNKQSNAFKALLQEFDNAVLS KENHMQ 
GKLNEKEKMISGQKLBIERLEKKNKTIjEYKIEILEKTTTIYEED 
KRKLQQELETQNQKI^RQFSDKRRLEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKIiWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRS VSPS PVP VS YL 


7005 


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERLGLFEEI» 

WAAOVKT?T.3iQMAf*llf"l?WD r PT VICT nOrni^TnlM 7"7\ r.mrnmrMmr — . . 
irnn^v ivK-ia/t^rjMyJPOC.Jfc'Kl XJ\.±oX»PWjyKiJJAVAWNTTPYQIiARQ 

I S STLADTAVAAQVNGEPYDLERPLETDSDLRFTiTFDS PEGKAV 
FWHSSTHVLGAAABQFliGAVLCRGPSTEYGFYHDFFLGKERTIR 
GSELPVLERICQELTAAARPFRRLE^SRDQLRQLFKDNPFKLHL 
IEEKVTGPTATVYGCX5TLVDLCQGPHDRHTGQIGGLKIjLSNSSS 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWLQVLP VI LLLLGAHPS PLS FFSAGPAT 
VAAADRSKWH I PIPSGKNYFS FGKI LFRNTTI FLKFDGE PCDLS 
x a nxLJUsAUwxNEI YNrKAEEVEIjYijEKIiKEKRGLSGKYQTS 
S KLFQNCS ELFKTQTFSGDFMHRIiPIiIiGEKQEAKENGTNLTFIG 
DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKENSLSNIiFTMT 
VEVKGP YE YLTTjEDY PLMI F FMVMC I VYVLFGVLWIaAWS ACYWR 
DLLRIQFWIGAVI FLGMLEKAVFYAGFQ 


7007 


2 


1001 


AMTVSGPGTPEPRPATPGASSVEQLRXEGNELFKCGDYGGALAA '" 
YTQALGLDATPQDQAVI^RNRAACHLKIiED YDKAETEAS KAI E K 
DGGDVKALYRRSQALEKLGRLDQAVLDLQRCVSLEPKNKVFQEA 
LRNIGGQIQEKVRYMSSTDAKVEQMFQILLDPEEKGTEKKQKAS 
QNLWIiAREDAGAEKI FRSNGVQI»LQRI»IiDMGETDLMIjAAIjRTI* 
VG I CS EHQSRTVATLS I LGTRR WS I LG VES QAVS LAACH1.I*Q V 

i ir urujxvavj V JVf^r KCaiVEitxHJL J. Vot WK.O V WGIjIjD VT VMEGMGliSQ 

PGQFFGDQTCSCRI»FGIRFGDI I LI* 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWIiGRPPSGI*PPGPR 
SPPPIAGPGQKMVQKKPAELQG FHRS FKGQNPFELAFSLDQPDH 
GDSDFGLOCSARPDMPASQPIDIPDAKKRGKKKKRGRATDSFSG 
R FEDVYOLOED VLGEG AHAR WYrr* T MT . T T<5 Aw vauv ttp tr adpu 

IRSRVFREVEMLYQCQGHRNVLELIEFFEEEDRFYJjVFEKMRGG 
S IIiSHIHKRRHFNELEAS VWQDVASALDFLHNKG IAHRDLKPE 
NI LCEHPNQVS PVKI CDFDLGSG I KLNGDCS P ISTPELI/TPCGS 
AE YMAPEWEAFSEEAS I YDKRCDLWSLGVXLYIUUSG YPPFVG 
RCGSDCGWDRGEACPACQNMLFES IQEGKYE FPDKDWAHI SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNE 


7009


1 


626 


ARQLRNSW VDDFVAAPIil PLSQQI PTGNSLYES YYKQVDPAYTG 
RVGASEAAIiFLKKSGLSDIILGKITOLADPEGKGFIiDKQGFYVA 
LRLVACAQSGHEVTLSNLNLSMP P P KFHDTS S P LMVTP PSAEAH 
WAVRVEEKAKFDGI FESLLPINGLLSGDKVKPVLMNSKLPLDVL 
GRVWDLSD IDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571


SHTRRA WP ETIitiS PLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
GS\mAAI>VDQSGVLIiAFADQPIKNWEPQFNHHEQSSEDIWAACC 
WTKKWQGIDLNQI RGLGFDATCSLWLDKQFHPIjPVNQEGDS 
HRNVIMWLDHRAVSQVNR INETKHS VLQYVGG 


7011 


3 


994 


R I QTLPNQNQSQTQPLLKTPPAVLQPI APQTTFGVQTQPQPQSL 
LQAQISAASITPLI^TQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRXDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERSPRRERERSPRRVRRWPRYTVQFSKFSLDCPSCDMM 
ELRRRYQNIiYIPSDFFDAQFTWVDAFPLSRPFQLGNYCNFYVMH
REVESLEKNMAILDPPDADHLYSAKVMLMASPSMEDLYHKSCAL 
AEDPQELRDGFQHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP 
DPEKD PS VLI KT\AI RCCKALTG 


7012 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTVVPGSATPME
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL
PTKKTGKVIIIGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV
Air KKGNi VAULGAMV VTGLiGGN PMA VVS KQ VNME LAK I KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP
VSLGQALEVVIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKVVLCFDRV
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETVVSRWRADPWA
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP
AQQSPSM 


7013 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTVVPGSATPME
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL
PTKKTGKVIIIGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV
AT c KKJSJM i V AD JjGAMVVTGLiGGNPMAVVSKQ VNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP
VSLGQALEVVIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKVVLCFDRV
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETVVSRWRADPWA
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP
AQQSPSM 


7014 


3 


3950 


DFE VGD KIR I LATLEDG WLEGS LKGRTG I F P YRF VKLCPDTRVE 
ETMALPQEGSLAR I P ETSLDCLENTLGVEEQRHETS DHEAEE PD 
CIISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVEWEM 
PLATDS PTS DPTEWNGISSQPQ VP FHPNLQKS Q YYSTVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERIiEMKPG 
PQAQGLVMEAATHS QGDGSTDLDS KLTQQLI EFEKSLAGPGTE P 
DKILRHFS I MDFNSEKDI VRGSS KLI TEQELPERRKALRPPPPR 
PCTPVSTS PHLLVDQNLKPAP PLWRPSRPAPL PPSAQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
RDLDMYSRAQEELNLMLEEKQDES SRAETLEDLKFCESNIESLN
MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
EIjIjQTERD Y I RDIiEMC I ERIM VPMQQAQVPN I DFEG LFGNMQMV 
I KVS KQLLAAIiE I SDAVG P VFLGHR DELEGTY KI YCQNHDEA I A 
LLE I YEKDEKI QKHLQDS LADLKS LYNEWGCTNYINLGSFI.I KP 
VQRVMRYPLLbMEH,NSTPESHPDKVPLTNAVIAVKEl^n7NINE 
YKRRKDLVLKYRKGDEDSLMEKISKLNIHS I IKKSNRVS SHLKH 
Jj ior A.FyiiUJhiVr r.fc.TEM!*FRMQERIjIKSFIRDLSXiYLQHIRES 
ACVKWAAVSMWDVCMERGHRDIiEQFERVHRYISDQLFTNFKER 
TERLVISPIiNQLI^MFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARNNYEALNAQKLDEIiPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQAJjEQLKPLIiSLLKVAGREGNL IAIFHEEHSRVLQ 
QDQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRASLLARYP PEKLFQAERNFNAAQDLDVSIjLEGDLVGVI KKK 
DPMGSQNRWLIDNGVTKGFVYSSFLKPYNPRRSHSDASVGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTIiSASIJIPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQ PTATPRS YRN FRHP E I VGYS VPGRNGQSQDLVKG 
CARTAQAPEDRS TE PDGS EAEGNQ VYFAVYT F KARNPNELS VSA 
NQKLKILEFKDVTGNTEWWIAEVNGKKGYVPSNYIRKTEYT 


7015 


1842 


513 


RQAWHE \ VAAPS WRGARLVQS VLRWQVGPHVARERVI P FSSbl* 
GFQRRCVSCVAGSAFSGPRIASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVIiVHHPDMPENSRVI,RVVLLGAPNAG 
KSTLSNQIiLGRKVFPVSRKVHTTRCQAIiGVITEKETQVIIjLDTP 

QI^PQLLRCIiTKYSOIPSVLVMNKVDCLKQKSVLLELTAALTEG 
WNGKKDKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FMIiS ALS QEDVKTLKQ YLLTQ AQPGPWE YHSAVLTSQT PE 
EI CANI IREKLIjEHLPQEVP YNVQQKTAVWEEG PGGEL VIQQKL 
LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016 


167 


2513 


I LiNAP KPP PPRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQKLVSQ IEDAMRKAGVAHSKSSKDMESHVFLKAKTRDE YLS 
LVARLI IHFRD IHNKKSQASVSDPMNALQSLTGGPAAGAAG IGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSS SS SSRRR YS SSSS SSNS KQ 
eQJvjQSAMQQ \Q FQA \ WQQQQQu \QQQQQQQQHL I KLHKQNQQ 
QIQQQQOX3LQRIAQLQIX2QQQO^QOH2QQQQQQQAL,QAQPPIQQP 
PMO^PQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSO^QAIiPGQMIiYTQPPIiKFVRAPMVVQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQSS LPMLSSPS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QS PVTARTPQNFS VPSPGPLNTP VNPS S VMS PAGSSQAEEQQ Y 
LDKLKQLS KYI EPLRRMINKI DKNEDRKKDLSKMKSLLDILTDP 
S KRCPLKTLQKCB I ALEKLKNDMAVPTPPPPPVPPTKQQYLCQP 
LLDAVLANI RSPVFNHSLYRTFVPAMTAIHGPPI TAP VVCTRKR 
RLEDDERQS I PSVLQGEVARLDPKFTiVNLDPSHCSNNGTVHL I C 
KLDDKDLP S VPPLEI*SVPADYP AQS PLW IDRQWQYDANPFLQS V 
HRCOTSRLLQLPDKHSVTAIiLNTWAQSVHQACLSAA 


7017 


1 


1785 


INLGNTCYMNS VI ♦ALFMATDFRRQVLSLNLNGCNSJjMKKLQHI, 
FAFLAHTQREA YAPR I FFEASRP PWFTPRSQQDCSEYLRFLIiDR 
LHEEE K I LKVQASHKPSE ILECS ETSLQE VASXAAVlrTETPRTS 

dgektliekmfggkxirthlrclncrstsqkafaftdlsiiafwps 
ysleymscpdcsqsps iqdgglmqasvpgps eepwynpttaaf 
icdslvnektigsppnefycsentsvpneswkilvnkdvpqkpg 
gettpsvtdllnyflapeiltgdnqyycencasiiqnaektmqit 
eepeylii»tllrfsydqkyhvrrkildnvsi.plvlei,pvkrits
FSSLSESVJSVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLS 
S VWHSG I S SESGH Y YS YARN I TS TDS S YQM YHQSE ALALAS SQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDSRVTFTS FQSVQK 
ITSRFPKDTAYVLLYKKQHSTWGLSGNNPTSGLWINGDPPLQKE 
LMDAIT KDNKL YLQEQELN ARARAliQAAS ASCS FRPNG FDDND P 
PGS CGPTGGGGGGGFNTVGRLVF 


7018 


484 


1066 


SLV FRGNT WSGEAGH HCSAL PNLAAYHQL F VGTERI RAP E 1 1 FQ 
PS L IGEEQAG I AETLQY I LDR Y P KD VQEMLVQNTVFXTGGNTMY P 
GMKARMEKELLEMRP FRSS FQVQLASNPVLDAWYGARDWALNHL 
DDNEWITRKEYEEKGGEYLKEHCASNIYVPIRLPKQASRSSDA 
QASSKGSAAGGGGAGEQA 


7019 


1048 


335 


APGG FLVTM VFPAPS P PWMLGCCS HE VTAG P PTL CKDMS ALVAA 
RMRHIPLAPGSDWRDLPNIEVRLSDGTMARKLRYTHHDRKNGRS 
S 5GALRG VCS CVEAGKACDPAARQ FNTL I P WCLPHTGNRHNHWA 
GLYGRLEWDGFFSTTVTNPEPMGKQGRVLHPEQHRWSVRECAR 
SQGFPDT YRLFGNI LDKHRQVGNAVP PPLAKAIGLE I KLCMLAK 
ARES AS AK I KEEEAAKD 


7020 


1 


2154 


FADSKRKSVLLDKI KNLQVALTS KQQSLETAMS FVARNT FKRVR 
NG FLMR KVAVFFSNT PTRAS PQL»RE AVLKLS DAG ITPLFLTRQE 
DRQLINALQ INNTAVGHAL.VLPAGRDLTDFLENVLTCHVCLDI C 
NI DPS CGFGS WRPSFRDRRAAG S DVDI DMAF ILDSAETTTLFQ F 
N EM KKYIAYI»VRQLDMS PDPKASQH FARVA WQHAPS ES VDNAS 
MPPVKVEFSLTDYGSKEKLVDFLSRGMTQLQGTRA3LGSAIEYTI 

ENVFESAPNPRDLKIWLMLTGEVPEQQLEEAQRVILQAKCKGY 

f?F\TVT jTZTCZWfC \TtJT tfFW'FK'B d7DMrvTn?l?VT Trr\tr£?T»n , T mcttit x*r\ 
i? X* v vij»ji.vjKA,vrjxiuiv 1 1 r^ioafcTiJLJVi? jp js_l»v Ui\.£» HjXjWEEPLiMR 

FGRLLPSFVSSENAFYLSPDIRKQCDWFQGDQPTKNIiVKFGHKQ 

VNVPNNVTSS PTSNPVTTTKPVTTTKPVTTTTKP VTTTTKPVTI 

INQ PS VKPAAAKPAPAKP VAAKP VATKTATVRPP VAVKP ATAAK 

P VAAKPAAVR P PAAAAAKP VATKPEVPRPQAAKPAATKP ATTKP 

MVKMSREVQVFEITENSAJOiHWERPEPPGPYFYDLTVTSAHDQS 

LVLKQNLTVTDRVIGGLLAGQTYHVAVVCYLRSQVRATYHGSFS 

TKKSOPPPPOPARSASSSTINLMV < ?TPPl>aT.TP"rnTr , VT Dwrror* 

TCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA 
PVLAKPGVISVMGT 


7021


2 


338 


VNAVS FFPNG YAFATGSDDATCRLFDLRADQELLLYSHDNI ICG 
ITSVAFSKSGRLIJ^AGYDDFNCNVWDTLKGDRAGVIiAGHDNRVS 
CLGVTDDGMAVATGS WDS FLRIWN 


7022 


2 


856 


VYIGSFWSHPLLXPDNRICL.FEAFFOnT.FPnT HQ T ddmhat dittm 
DLI KRARLAKVRAYI I SSLKKEMPS VFGKDNKKKELVNNLAEI Y 
GR I EREHQ I S PGDFPNLKRMQDQLQAQDFSKFQ PLKS KLLEWD 
DMLAHD I AQIiMVLVRQEE S QRPI QMVKGGAFEGTLHGPFGHGYG 
EGAGEG I DDAEW WARDKPM YDE I FYTLSPVDGKITGANAKKEM 
VRSKLPNS VLGKI WKLAD I DKDGMLDDDEFALANHLI KVKLEGH 
ELPNELPAHLLPPSKRKVAE 


7023 


2 


748 


AMVFGG WP Y VPQ YRD IRRTQNADG FSTYVCLVLLVANILR I LF 
WFGRR FES PLLWQSAI M I I*TMLLMLKLCTEVRVANELNARRRS F 
TAADSKDEEVKVAPRRS FLDFDPHHFWQWSS FS D YVQCVLAFTG 
VAGYI T YI*S I DSAIiFVETLGFLA VL TEAMLGVPQL YRNHRHQST 
EG MS I KMVLMWTSGDAFKTAYFLLKGAPLQFSVCGLLQVLVDLA 
I LGQAYAFARH P Q KPAPHAVH PTGTKAL 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVLSSGEQLQMPVKPERGLGPSDGWLV 
SSRRGS PGTVLGLPFWUCTP VLVSRS I RSMLLLTRS PTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RLLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLIjDHRGRARCKADFRGQW VLM YFGFTHCPD I C PDELEKLVQV 
VRQLEAEPGLP PVQPVF I TVDPERDD VEAMAR YVQDFHPRLLGL
TGSTKQ VAQAS HS YR V Y YNAG P KDEDQD Y I VDHS I AI YLLNPDG 
LFTDYYGRSRS AEQI SDSVRRHMAAFRSVliS 


7025 


232 


832 


ERNSPIGNNENL*K\HSLDCLCFRGDWEGNTQFQTLQDNQEECF 
KQVIRTCEKRPTFNQHTVF^HQRIJTrcDKLNEFKELGKAFlSG 
SDHTQHQL IHTS EKFCGDKECGNTFLPDSE VIQYQTVHTVXKTY 
ECKECG KS FS LRS S LTGH KRI HTGE KP FKCKDCGKAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026


328 


1146 


N PNP S IGD I KD I KKAAKS MLD PAHKSHFHP VTPSLVF LCF I FDG " 
LHQALLS VG VS KRS NTWGNENEERGTP YAS RFKDM PN F I ALE K 
SS VLRHCCDLLIGVAAGSSDKI CTSSLQVQRRFKAMMAS IGRLS 
HGE S ADLL I S CNAES AI GW I SSR P WVG ELM FTFL»FGDFES PLHK 
LRKSS *LPRKHR*QPINAVRMFI*DQCMDGS1AIjRAIVSEI PVFE 
EKKNNG* KG IGEIF * WGCTLPPHYWGAVTTNVPKLSNSGKLLG 
QDEQPHIFG 


7027 


43 


954 


GRRLQQQQRPEDAEDGAEGGGKRGEAGWEGG YPE I VKEN KLFEH 
YYQELKIVPEGEWGQFMDALREPLPATLRITGYKSHAKEILHCL 
KN KYFKE LED LEMDGQKVEVPQPI>S W YPEELAWHTNLSRKI LRK 
SPHLEKFHQFLVSETESGNISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYLLVH 
QAKRLSSPCIMVVNHDASS I PRLQ I DVDGRKE I LFYDR I LCDVP 
CSGDGTMRKNIDVWKKWTTLNSLQLHGLQLRIATRGAEQL | 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSKN KIRNS 
KKMQS W YSMLS PTYKQRNEDFRKLFS KLPEAERL I VDYS CALQR 
El LLQGRL YLS ENW I CFYSN I FRWETT I S I QLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 


40 


VLESNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQLRLWG 
/PCPHAGRETGPRASAP1 PGS *GHGWHW*RKDGRGERSEGPSAL 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA* RSLPGAAASERTEMTKERSP/RPCQGYDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCLLYPLQS IMPE*QLR*GAHASPPTQG 
R*GKGGPRSPLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS\ 
AAPGQKRGHREA *QGPEPV/ WGRVTTHLQGPAG* TKPLGS \ RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRSTLTFSPQLSI PMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 

PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 


521 


FVCFSAPGSGQGGKRRVNMELSAVGERVFAAEALLKRRIRKGRM 
EYLVKWKGWSQKYS TWEPEENI LDARLLAAFEEREREMELYGPK 
KRGPKP KTFLLKAQ AKAKAKTYEFRSDS ARG I R I PYPGRSPQDL 
ASTSRAPJEGLRN \ RVCPRQRAAPAPAAP \ PRRGP SG PGPRPG * G 
PGLHFPG PGGPS KHGFVPAS EQHQHQQHLPRRGPS GPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/CKPS /RHCDELHEGPSRTAALPCX5KPQPKHGVEECG/ PCPCLA 
PR RLTE P PALTVSP VGRAAPSGAL * PSGRACSACSHRLAP E AAL 
S AAAPRPS LGSGQNASGLPAASLPPQDSSQPHKTVPSPARS VP P 
LGAQARAAPPRLWCPRALVSG* EASPEAVSVAAGPPVPGPTPST 
SGSTASHSRRGC* S PR * TPAP PRRDHGRS AAFE VLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032


1393 


2104 


RRPGRTEP VE PP P VP PP PRASNS KSRCR * RNLHLAPL* QSPLR K 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
EPWMKRQFGRLHSLFWKSWQKMNSFLLTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPIASSLPPPPGPPPLLPV 
PLA* LSRSGILVPPNSGFSLSC\ PLGDH * GSSGEVRGS CGSPPP 
HHCWVLPPPP * LLLPPR 


7033 


689 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA
LMMPSSCPWRTGALGPSPAGSRALGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCS CCWGWC * SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT+ S SSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD
TLL\TLFYFQILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNI
RVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK
K 


7035 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD
TLL\TLFYFQILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNI
RVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK
K 


7036 


442 


761 


CLAPLFSCFQIINLHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFSCFQIINLHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR
LQEAASPAAERACRSSKGTSTSRTG


7039 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR
LQEAASPAAERACRSSKGTSTSRTG


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYESVMRDSEATGSAS SAQDSTS ENSSS VGGRCRSLKTPKKRSN


7041 






PGSQRRRUlPALSIJDTSSPVRKPPNSTGVRWVDGPbRSSPRGLG - 
E P FEI KVYE I DDVERLQRRRGGAS KE AMCFNAKLK I LEHRQQ R I 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEALECVTERIiESRVNFCKAHLMMITCPDlT 


7042 


1 


567


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL ' 
NDG YD WGRLNLQS VTEQS S LDD FLATAEI*AGTE FVAE KLN I KFV 
PAEARTGLLSFEESQRIKKLHEENKQFLCIPRRPNV7NQNTTPEE 
LKQAE KDNFL»EWRRQ L \ VRLEE EQ KL 1 LTPFERNLDFWRQLWRV 
IERSDIWQIVDA 




7


345


PIHMAAAAL.RADI \ISPLFPHIQGYLLLSASHG\ATSLHTKGA1i" 
PLETVTMYTVIPKSKYVLVKPDTQYPYSENLDEFKRLAENSASN 
DDLLMAEVAI SDYGDKLTLEIjREKY 


7043
7044 


2 


2170


ARGMAARDSDSEEDLVSYGTGLEPLEEGERPKKPIPLQMTVRI) 
EKGRYKRFHGAFSGGFSAGYFNTVGSKEGWTPSTFVSSRQNRAD 

KSVLGPEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR

QLAAATAP I PGATLLDDIjI TPAKLS VGFELLRKMGWKEGQGVGP 
RVKRRPRRQKPDPGVKlYGCALPPGSSEGSEGPnnnVT DnKRrpp 

APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 
S ERAGDLGE IGLNKGRKLG ISGQAFGVGALEEEDDD I YATETTiS 
KYDTVLKDEE PGDGI»YGWTAPRQ YKNQ KES EKDLR YVG KI LOG F 
! SLASKPLSSKKIYPPPELPRDYRPVHYFRPMVAAT^PMQHT t n\r 
IiSESAGKATPDPGTHSKHQLNASKRAELLGBTPIQGSATSVLEF 
LSOKDKERIKEMKQATDLJCAAQLKARSIiAQNAQSSRAQPSPAAA 
AGHCSWNMAU3GGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCLDPSMTEWERGRERDE FARAALLYAS wctt c v 

THAKEEDDSDQVEVPRDQENDVGDKQSAVKMKMFGiO,TRDTFEW 
HPDKLLFQ / Rl»VGliPRVKRDKYS VFNFLTLPETASLPTTQASSE 
KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFLRLARSKAEPPK 
QQSSPLVNKEEEHAPEIiSAN 


7045


276


734 


E VYLTDEFAKGRKVADLYELVQYAGNI I PRL YLLITVGWY VKS 
FPQSRKDILKDLVEMCRGVQHPI^GLFLRNYLLQCTRNILPDEG 
EPTDEETTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELR IL VG TNLYRLSQ V 




3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADVVEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7046 
7047


3


513


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADVVEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7048 


103 


466


QMKI EKCGWS EGLTS I KGNCHNF YTAISKDVTY KELKNLLNSKN ' 

IML I DVREI WE I LEYQKI PESINVPLQEVGEALQMNPRDFKEKY 
NEVKPSKSDS / TVR^VT.?mn5cvv7i t n>PATor ^, 


7049 


92 


627


FFCLTLLSSWuyRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDLAMTYKQRAENTTQBELREFQEGSREYEAEIjETQLQQIETRN 
RDLLSENNRLRMELETI KEKFEVQHSEGYRQI SALEDDIiAQTKA 

ikdqlqkyireleqanddlerakratdhglsktfeVqrlnXqai 

EKKW 


7050 


393 

393


938 

938


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG

KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG


7051 


119 


816 


KKMNLAEICDNAKKGREYALLGNYDSSMVYYQGVMQQIQRHCQS 
VRDPAIKGKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDFPV 
SCQDEPFRDPAVWPPPVPAEHRAPPQ1RR/RQSRSKTSEERNGR 
SRSPGTCRPST\PISKSEKPSTSRDKDYRARGRDDKGRKNMQDG 
ASDGEMPKFDGAGYDKDLVEALERDIVSRNPSrHWDDIADLEEA 
KKLLREAGVLPMWM 


7052 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKVVLDVGSGTGILSMFAARQGPRR


7053 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKVVLDVGSGTGILSMFAARQGPRR


7054 


1 


1036 


GTSQRSRETDARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG ~ 
RRCRWDAMEYDEKLARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VT P EE ALPE LP PGEPE FRCPERVMDIjGIjS EDH FSRP VGL FLASD 
VQQItRQAIEECKQVILELPEQSEKQKDAWRLIHIiRLKLQELKD 
PNEDEPNIRVIiLEHR F YKEKSKS VKQTCDKCNT 1 1 WGL I QTWYT 
ClOCYYRCHSKCIiNIjISKPCV'SSKVSHQAEYEIjNICPETGIjDSQ 
DYRCAECRAPI /CS/DGWPSEARQCDYTGQYYCSHCHWWDLAV 
IPARWHNWDFEPRKVSRCSMRYLAI»MVSRPVLRLREIN 


7055 


2


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG
SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL
M


7056 


2 


527 


DSKRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG
SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL


7057 


1368 


431 


G I YLHVNEKI PRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASLRGFFTEDEPGCFGEGENIiPEALQNIQDEGTGEQL 
S PQERI SEKQLGQHIiPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRECGKTF YRNSQLI FHQRTHTGETYFQCTI CKKAFLR 
SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGLRHHEKIHTGEKP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
DKHQRSHLGKKPFQ * PVTKLS FP I S I SQPSHKNTQLHQEEIjCIjR - 
GYPC 


7058 


1 


469 


FSGFGAVPDAIiGCRMSDIiRITEAFLYMDYLCFRALCCKGPPPAR 
P E YDLVC I GLTGSGKTS LLS KLCS ESPDNWS TTG FS I KAVP FQ 
NAI LNVKELGGADN I RKYWSR YYQGSQG VI PVLDSASSEDDLEA 
ARN * S CTQI*IjQHPQL,CTLPFIi I LA 


7059 


1 


1178 


WPAFPRQPAAAAMDALLGTGPRRARGCLGAAGPTSSGRAARTPA" 

APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 

CYLSFPDSHSGCLGDTQFSFRMRQCGGQRSPWHADDRHYNSRAP 

VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRI, 

FQALLSLIAPEYFDKLAPCLEAVCSEIDQWPAPAPGQTLNLPVM 

G WVQVR I PSRVDKSE S S PPKQFDQENLLPAP WLAS VHELDLF 

RCFRPVLTHMQTXWELMLLGEPLLVLAP^ 

QPLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNWLGVTNPFFIK 
TLQHWPHIIiRVGEPKMSGDLPKQVKLKKPFKV*RPWDTKP 


7060 


90 


1670 


S VNIiP PS IjWP WEEAMDS TKS E PLKGS PEAEDGN I EY KKL VNPSQ 
YRFEHIiVTQMKWRLQEGRG EAVYQ IGVEDNGltLVGLAEEEMRAS
LKTLHRMAEKVGADITVLRERETOYDSDMPRKITEVLVRKVPDN"" 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
LHE I QSGRTSS ISFE I LGFNS KGEVHG INGTQWGQTLRMGW * + * 
k x uij»jK v w KLtc c. J. V * MjNALiKGIj* 1 isS APLRKSMGNQLN* I KNG 
VKI KRQGHPGNGLG PGNS EG VGRAGRRH * G P WALGQ WNYSDSR 
TAEE ICESS S KMI TFIDLAGHHKYLHTT I FGLTS YCPDCALLLV 
S ANTG I AGTTREHLGliALALKVPFFI WS KI DLCAKTTVERTVR 
QLBRVXjKQPGCHKVPMIjVTSEDDAVTAAQQ FAQS PNVTPI FTLS 
SVSGESLDLLKVFLNILPPIiTNSKEQEELMQQLTEFQVDEIYTV 
PEVGTWGGTLSR*IDLIiATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPSPLGPPCLPVMDPETTLEEPETARLRFRGFCYQEVAGPRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVL.EQFLGTL.PPEIQAWV 
RGQR PGS PEEAAAL VEGLQHDP +ARMPS PLG PPCLPVMDPETTL 
EE PETARXRFRGFCYQEVAG PREALARLRELCCQWLQ P EAHS KE 
QMLEML VLEQFLGTIiP PE IQAWVRGQRPGS P EEAAAli VEGLQHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERLHWL.SYFFCIPKHKLKSSQKDKVRQFMACTQAGER 
TA I YCLTQNE WRLDEATDS FFQNPDS LHRES MRNAVD KKKLE RI» 
YGRYKDPQDENKIGVDGrQQFCDDLSLDPASISVLVIAWKFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQEIiKDTAKFKD 
F YQFTFTFAKNPGQKGLDI* * MAGAYWKI/V.LSGRFKFL YLWNTFIj 
MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR * PELPPDMNSLEQAEDLKAFERR 
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNVILIDPETQKVSF 
FTSLWNHPFFTISCITLIGLFFAGIHKRVVAPS I IAARCRTVIiA 
EYNMSCDDTGKLILKPRPHVQ+QSSDIVMGLKIAFLRISDTAKS 
HKGFLIiRLDM 


7064 


300 


an j 

Oo4 


RDTGSDPSSfRkLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
S RRTG CWTC P P E S GHAQARRSRRAS AS R WGARGAVRS AVAARGC 
SSRAGRWLETPGRRRGP PACAAAAGRLRGPAP * AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRS PGPSPKSAAP 
PLLTPLGAGRAGGSRANS 


7065 


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
r i xujjsrcxvsjbwdr vCV JCMvOlL/WKlvW " iULit»c»ABDEIEDI 
QQEITVLSQCDSSYVTKYYGSYLKGSKIjWIIMEYJU5GGSAIiDIiL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPIjDPADHPRAPASIiRSNVRAATMMQICDT 
YNQKHSLFNAMNRFIGAVNNMDQTVMVPSLLRDVPIiADPGIjDND 
vu V Ei vooolj K X JrF 


7067 


152 


973 


KENITMATEIGSPPRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAGY YNDIj VP P I GMLNN PMNAVTTKFVRTS TNKVKCP VF WRW 

tpegrrlvtgassgeftiiwngltfnfetilqahds pvramtwsh 
ndmwmLtadhggyvkywqsnmnnvkmfqahkeairearfihnip 

FSWPIVMVKLFSKClLGAEMHGLCQFI^NFlkHPINTIFFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEYVLIjIiFIJUjCSAKPFFSPSHIA1»KNMMLKDMEDTDDDDD 

ddddddddddbdnslfptreprshffpfdlfpmcpfgcqcysrv 

VHCSDLGLTSVPTN I PFDTRMLDLQNNKI KB I KENDFKGLTSLY 
GLILNNNIO^TKIHPKAFLTTKKLRRIjYIjSHNQIiSEIPLNLPKSL 
AELR IHENKVKKI Q KDTFKKK 


7069 


1147


1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETIiAKQ 

TLKDKTGTDSNSTESSETSTGSLCKESFSGQVSSSSLMPLTPFW

TLLQSNVPVIjQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP
BKTKKGRKDKAKKSKTKMPSLVKKWQSIQRBLDKKDNSSSSEED 
RVS TAQKR I E E WKQQQI»VSGMAERNANFE A 


7070 


1 


547 


DGTMEDSEAVQRATAL I EQRLAQEEENE KLRGDARQKLPMDLLV 
LEDEKHHGAQS AA LQKVKGQER VRKTSLDLRR E 1 1 DVGG IQNLI 
ELRKKRKQKKRDAIAASHEPPPEPEEITGPVDEETFLKAAVEGK 
MKVIEKFLADGGSADTCDQFRRTALHRASLEGHMEILEKLLDNG 
ATVDFQ 


7071 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7072 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAG PSADS VTENKI GS PPKTP 
VSNVAATS AG PSNVGTELNS VPQKS S PFLTR V PAYPPHSBN IQ Y 
FQDP RTQ I PFE VPQ YPQTG Y Y P P P PTVPAGVAPC VPRF VRSNNV 

PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPWD<3PP TUP D DMVfiPnnT T D CMCT.DDMMmmc pin/ni» 

SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IPJRKPDQWAQ YHTQKAPLVS STLP VATQS PTPPSTLNRGEGS 


7073 


50 


504 


LAHGS FG VSDFPAPAAAP AHTLTS FSGSLS PQFRKPLGRAPAM P 

LVRYRKTA/TT/5YPf^f^TCT , QT.ZlHr>T7\ri?(^PT?GT?r*vr4DTnrirN7 r i»vetrT 
«-* v ****v*\. v v j.jjo j. ivv.. v vjxvioxittri^r VE»V?CtIT oCv? lUr 1 V avi X I o K.X 

VTLGKDEFHLHLVDTAGQDEYSILPYSFI IGVHGYVLVYSVTSL 
HSFQVI ESLYQKLHEGHGK 


7074 


263 


1003 


VCP VLCSTRQEPGHS S LVTYFG KPTRR KEFLLGHCI AAGKMMI S 
VDLETN YAELVLDVGRVTLG ENSRKKMKDCKLRKKQNERVSRAM 
CALLNSGGGVI KAEIENEDYS YTKDGIGLDLENS FSNILLFVPE 
YLD FMONGNYFTi T W T.NT*?^ T .P T TTT eusTT . V JTD nTTC Jk v\7 

MNATAALEFLKDMKKTRGRLYLRPELLAKRPRVDIQEENNMKAL 
AGVFFDRTELDRKEKLTFTESTHVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRLSRLQGVERIMKKTEESESQ " 
VEPEI KRKVQQKRHCS T YQ PTPPLSPAS KKCLTHLEDLQRNCRQ 
Al TLNESTG PLLRTS IHQNSGGQKSQNTGLTTKKF YGNNVEKVP 
IDII 


7076 


279 


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWS KGRKRKKPLRDSNAPK ' 
S PLTG YVR FMNERREQLRAKRPEVPFP E I TRMLGNEWS KLP PEE 
KQRYliDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
RQDAARQATHDHEKETEVKERSVFDIPI FTEEFLNHSKAREAEL 
RQLRKSNMEFEERSIAAIjQKHVESMRTAVE KLEVD VIQERSRNTV 
LQQHLETLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


S SMGSNSE I NGLALRKTDKYGFLGGSQYSGS LKSS I P VDVARQR 
ELKWLDMFSNWDKWLSRRFQKVKLRCRKG I PSSLRAKAWQYLSN 
SKELLEQNPRKFEELERAPGDPKWLDVIEKDLHRQFPFHEMFAA 
RGGHGQQDLYRILKAYTIYRPDEGYCQAQAPVAAVLLMHMPAEQ 
AFWCLVQI CDKYLPG Y YSAGLEAI QLDGB I FFALLRRAS PLAHR 
RLRRQR I DPVLYMTEW FMCI FARTLPWAS VLRVWDMFFCEGVK I 
I FRVALVLLRHTLGSVEKLRS CQGMYETMEQLRNLPQQCMQEDF 
LVHEVTl^P VTEALI ERENAAQLKKWP^TRGELQYRPSRRLHGS 
RAI HEERRRQQ P PLGP SSS 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQFILLAKGTSGSALTALISQVLEAPG 
VYVFGE LLELANVQ ELAEGANAA YLQ LLNLFAYGT YPD Y I ANKE 
SLPELY 


7079 


2 


376


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
MQARKKRRG 1 1 EKRRRDRINSS LSELRRX/VPTAFEKQGS SKLEK 



594 
AEVLQMTVDHLKMLHATGGTGTHALLFQASFIQQIF "~" ' 


7080 


200 


595 


VQLPLEAPCLSLLSCRDHSGGNRDLSRRHRDCRVYGSPQDGIPY " 
LTHPLCHODWSVGRLOIRAIATPGHTQGHLVYLLDGEPYKGPS 
CLFSGDLLFLSGCGEFPRKREElrGEEGETEVRAATVPWRALKP " 


7081 


213 


506 


AVTEEEMILNSLSLCYHNKLILAPMVRVGTbPMRLliALDYGTADI 
VYCEELIDIiKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 
RWFQMGTS 


7082 


3 


1137 


CCRSS PRDLRDGEREHE AAQRKAPGAES CPSLPLS ISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQS ATE VEERHVS PS CSTSRERPFQAGEL ILAETGEGETKFKK 

IiFRLNNFGXjTiNSNWfiAVPPrJVTV^WCDfSOTT D c? ej tr-'T/vx vimr nnn 
iNwi'tvjwiiiion nunV tr " Ol\l VVjr|\.r ir\j\J J. Lj>C£>0 C (jlvy X MLirirtir 

ALEDYWLMKRGTAITFPKDINMILSMMDINPGIXrVLEAGSGSG 
GMSLFLSKAVGSQGRVISFEVRKDHHDIiAKKNYKHWRDSWKLSH 
VEEWPDNVDFI HKDISGATEDI KSLTFDAVALDMLNPH VTLPVF 
YPHLKHGGVC P V YWN I TQV I EI*LD 


7083 


115 


541 


RSNAVQLTRMEYAMKSliSLLYPKSLSRHVSVRTSWTQQLLSEP 
SPKAPRARPCRVSTADRSVRKGIMAYSLEJDLLLKVRDTLMLADK 
PFPLVLEEDGTTVETEEYFQAIiAGDTVFMVLQKGQKWQPPSEQG 
TRHPLSLSHK 


7084 


3 


522 


NSVS VSS QSR FLAS VPGTGVQRSAAADMAAS TAAG KQR I PKVAK 
VKNKAP/AEVQ ITAEQLI*REAKKRELEt»L»P PPPQQKITDEEELND 
YKLRKRKTFEDNIRKNRTVISNWIKYAQWEESLKEIQRARSIYE 
RALDVDYRN I TLWLKYAEMEMKNRQVNHARNI WDRAITTL 


7085 


243 


1499 


«V0amcijt<2CKL»WKfa^£U(jAk , MAHJ riNQxXiQQVYEAIDSRDGASC 
AELVS FKH PHVANPRLQMAS PE EKCQQVLEP P YDEMFAAHLRCT 
YAVGNHDFIEAYKCQTVIVQSFLRAFQAHKEENWAIjPVMYAVAL 
DLRVFANNADG^LVKKGKSKVGDMliEKAAEIiLMSCFRV(^DTR 
AGIEDSKKWGMLFLVNQIiFKIYFKINKLHI>CKPLIRAIDSSNLK 
DDYS TAORVT YKY YVGRKAM PDSD VK na PP VT . C Pa T?T?vniiT> c? en 

KNKRMILIYXLPVKMLI^HMPTVELLKKyHI^QFAEVT'RAVSEG 
MLLLLHEALAKHEAFFI RCG I FL I I»E KUKI IT YRNLFKK VYLIiI* 
KTHQLS LDAFL VALKFMQVEDVD I DE VQCII*AN1»I YMGHVKGY I 
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


XLAARMGKQNS KLRPEVMODIiLESTDFTKHR TOFTWYTo^Fr.pnr-D ' 

SGHLSMEEFKKI YGNFFP YGDASKFAEHVFRTFDANGDGT I DFR 
EF 


7087 


166 


723 


LSGSSAGKVAAPCVPPSNHELVPITTENAPKNWDKGEGASRGG 
NTRKSLEDNGSTRVTPSVQPHLQPIRNMSVSRTMEDSCELDLVY 
VTERI I AVS FPS TANEENFRSNLRE VAQMIiKS KHGGNYIiIi FfTLS 
ERRPD I TKLHAKVLEFG W PDIjHTPAIiEKI CS I CKAMDTWIjNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAASPSSLLEMAGEITETGELYSSYVGLVYMFNLIVGTGALT 
M P KA FAT AG WL VS L. VLLVFLG FMS FMTTTFV I E AMAAANAQLHW 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS 
PNPFE ITDRVEMGQMASMFFNKVGVNLFYFCI I VYL YGDLAI YA 
AAVPFSLMQVTCS ATGNDS CG VEADTK YNDTDRCWG PLRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLVVPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKIAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTEIAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKXEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ
ASLLLGLE 


7090


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLVVPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKMN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ


7091 


186 


1076 


EGMIiTREHRCGRS EEQEIjE P WPS P KKARSGRWLRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAY 
ALHKQMRI VKP KVASMEEMATFHTDAYIjQHI*Q KVSQEGDDDHPD 
s i e yglg ydcpateg i fd yaaa3 gg at i taaqcli dgm ckvain 
WSGGWHHAKKDEASGFCYLNDAVLGIIjRLRRKFERILYVDIjDIiH 

hgdgvedafs ftskvmtvs lhkfs pgffpgtgdvsdvglgkgr y 
ysvnvpiqdgiqdexyyqicbryeppapnpgl 


7092 


522 


809 


KQG INEDQEESQKPRIjGEGCEPI S KRQMKKL I KQKQWEEQRELR 
KQKRKEKRKRKKIiERQCQMEPKSDGHDRKRVRRDVVHSTIjRLI I 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQASMVRMS fviaacqlvlgllmtsltess iqns 
ecpqlcvceirpwftpqstyrea 


7094 


2 


508 


fvrsmhwgvg FASS RPCWDLSWNQS I SFFGWWAGSEEPFSFYG 
DI XAFPLQDYGG IMAGLGSDP WWKKTLYLTGGAI^AAAAYLLHE 
LLVIRKQQEIDS KDAI I LHQFARPNNGVPSLS P FCLKMET YLRM 
ADLPYQNYFGGKLSAQG KMPW I EYNHE KVSGTE F 1 1 


7095 


1 


411 


IASSIiPKMASIjIXJSDRVLYLVQGEKKVRAPLSQIiYFCRYCSELR 
SLEOTSHEVDSHYCPSCLENMPSAEAKLKKNRCANCFDCPGCMH 
TLSTRATSISTQLPDDPAKTTMKKAYYtACGFCRWTSRDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSIiAVQEKPSQAGRRRSSRISFAGALFLTRFLLQELLLNNFC 
bAMS PAPDAAPAPAS I S LFDLSADAP VFQGIjSLVS HAPGEAIiAR 
APRTS CSGSGERESPERKLLQGPMD1 SEKLFCSTCDQTFQNHQE 
QREH YKIiDWHRFNLKQRIiKDKPLLSAliD FEKQS S TGDLSS I SGS 
EDSDSASEEDI^TI*DRERATFEKI.SRPPGFYPHRVLFQNAOGQF 
L YAYRCVLG PHQDPPEEAELLLQNLQS KGPRDCVVLMAAAGHFA 
GAI FQGRE WTHKTFHR YTVRAKRGTAQGLRDARGGPSHSAGAN 
LRRYWEATTiYKDVRDLLAGPSWAKAliEEAGTILLRAPRSGRSLF 
FGGKGAPLQRGDPRLWDI PLATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRLHSPQTHWKTVREERKKPTEEEIRKI CRDEKEALGQ 
NEESPKQGSGSEGEIX3FQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQOTQEEEPSTQSSQAVAAPIiGPL 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPS PADPRVLSL 
LS APIK3 S GGF TLLHAAAAAGRGS VVRLLL E AGAD PTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRI ITEPCEANAGSRQELQTERI SS 
FLAAQGDQAFHSGIiETNNSNS ELPLRVGLKVAQGS PLMGGQ VS A 
SNSFSRLHCRNANEDWMSALCPRLWDVPLHHbSIPGSHDTMTYC 
LNKKS P I SH E ES RLl»QIiLNKAl»PCI TRPVVLKWS VTQAL DVTE Q 
LDAGVRyi»DLRIAHMLEGSEKNLHFVHMVYTTALVEDTI,TEISE 
WLERHPREWILACRNPEGLSEDLHEYLVACIKNIFGDMLCPRG 
EVPTLRQLW SRGQQV I VS YEDE S S LRRHHELWPGVP YW WGNRVK 
TEAI>IRYLETMKSCGR 


7098 
7099 


82 


956 


SSFLKRCRKVI^CWGIPSEQSLFSTLEEPRDKEIDNYCVMRU3T " 
EARSGFWAPjntFPVNICRMTAVDGDRGGSSRETCRCHFHPSljEA 
IiVLIiLQDWQPGGVG I CT S FLG I S W ALLDYHRAL RTCL PS KPLLG 
LGSSVIYFLWNLLLLWPRVLAVAIiFSALFPSYVALHFLGLWLVL 
LLWWLQGTDFMPDPSSEWLYRVTVATILYFSWFNVAEGRTRGR 
AI IHFAFLLSDS ILLVATWVTHSSWLPSGI PLQLWLPVGCGCFF 
LGLALRLVY YHWIjHP S CCWKPDPDQ VD 


7100 


992 


210 


LFRIiAPGFLRSliARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
ARSIjPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD 
GAVLEVHVPQIGAGVSLPG I LAAKCGAEVI LSDSS ETLPH CLE VC 
RQSCQMNNLPHLQ WGLT WGHI S WDLLALPPQD 1 I UVSDVFFEP 
EDFED I LATI YFLMH KN PKVQLWS T YQVRS ADWSL EALI> YKWDM 
KCVHIPLESFDADKEDIAESTLPGRHTVEMLVISFAKDSL 




205 


671 


ANGG FWEAAPGSEVSZ.PL WVPTASHSKTTALG igsappphlsvl 
FLFS FP PQLGD PI»E AFP VFKKYDRNGLNVS 3 ECKRVSGLE PATV 
DWAFDLTKTlWQTMYEQSEWGWKDREKREEm'DDRAWYLTAWPN 
SS VP VAFSHFR FDVERGDEVLYW 


7101 
7102 


2 


503 


WRGGPRRAKRI»AGGAVGW VLLVRGVHS VRAGGGRPPRAADMKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 
ERVPTHIVBYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSR W I PLINERTDKDSRLPIiI 1A3GNKS DLVE YSR 




2 


503 


WRGGPRRAKRIAGGAVGWVLL.VRGVHSVRAGGGRPPRAADMKKD 
VR II>I»VGEPR VGKTSLIMSLVSEEFPEEVPPRAEE I Tl P ADVTP 
ER V PTH I VDYSEAEQSDEQLHQE I S QANVI CI VYAVNNKHS I DK 
VTSRWIPLIKERTDKDSRLPLILGGKKSDLVEYSR 


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMKGQASSVNIAATASEKSSSSES 
LSDKGSELKKSFDAVVFDVLKVTPEEYAGQITLMDVPVFKAIQP 
DELSS CGWNKKEKYS SAP 


7104 
7105 


1670 


795 


RLWEHRSVSAGASGWGLSSPGCIiljIiHPSLPEEERVDILINWAGV 
MRCPHWTTEDGFEMQFGVWHLGEAWAGAAPWVQAILPRRPPKVI, 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGH I DFDDLNWQTRKYNTKAAYCQS \ Kt»AI VLFTKELS RRLQ 
GSGVTVNALHPG VARTELGRHTG IHGS TFLQUHN \ WAHLLAAW S 
KSPRSWPAPAQHNTIAVAEELAWISGKYFDGLKQKAPAPEAED 
EEVARRLWAESARliVGLEAPSVREQPLPR 


7106 


765 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGLQDPQCLAIjFRVAVDKHQA 
LLKAAMSGQGVDRHLFAL Y I VS RFLHLQ S P FLTQVHSEQ WQI»S T 
&y j. f vy y WHi>FDVHNYPDYVS SGGGFGPADDHGYGVS YI FMGDG 
MITFHISSKKSSTKTDSHRLGQHIEDALLDVASLFQAGQHFKRR 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7107


14 
1145 


1064
591 


GLQAGH PH PRS ASR 1 PEADTH \ YSKLQRAFDS I VNKDHKRMFGT 
YFRVG FFGS KFGDLDEQEFVY KEPAI TKLPEISHRLEAFYGQC F 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF 
P YI KTR I S VI QKEEFVLTP I E VAI EDMKKKTLQLAVAINQEPPD 
AKMLQ M VLQGS VGATVNQG P LE VAQV FLAE I PAD P KX» YRHHNKIi 
RliCPKEF I MRCGEAVEKNKRLI TADQRE YQQELKKNYNKLKBNI* 
RPMIERKI PELYKP IFRVESQKRDSFHRSSFRKCETQLSQGS 
* I * WLQTGKKK 

7108 


1 


942 


VKVALLLTNLEQPRTESBWENSFTLKMFLFQFVNLNSSTFYIAF 
FIiGRFTGHPGAYLRLIWRWRLEECHPSGCLIDLCMQMGI I MVLK 
QTVmNFMEI^YPLIQNWWTRRKVRQEHGPBRKISFPQWEKDYNIi 
QPMNAYGLFDEYLEMILQFGFTTIFVAAFPIiAPLLALLNNIIEI 
RLDAYKFVTQWRRPLASRAKDIGIWYGILEG1GILSVITNAFVI 
AI TSDFI PRLVYAYKYGPCAGQGEAGQKCMVGYVNASIiSVFRI S 
DFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQF 
WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEALSIQLQPKE 
TQP FPKSEQVYLHFLS WTEDGPEPKDKGSLPQPP ITEVESQVF 
SE KLATDTST FEATSEGTLELQQRN PKAERLRWS PAQEES FRQM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGEKPYKC 
SDCGKTFKQSSNXGQHQRIHTGEKPFECNECGKAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYIiSQHRRIHSGEKPFICKECG 
KAYGWCSELI RHRR VHARKEPSH 


7110 


96 


697 


RLDN FSG FLVEVTKEERH I VKPLiYDR YRLVKQM LTRAS I T PVLG ' 
SPSTKRRGQMLQPI IEGETAHFFEE I KEEEEDGVNLSSELGDML 
KTAVQVQSSLKWSESDVEENQEKLALDLRLSSSRAASMPELLEQ 
LWKARAEKKKLRKTLREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
YKKI KAKLRIiLEVIilSKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFAIjEIiNEljTAE 
LKRSLPSTDTRLRPDQRYLEEGN I QAAEAQKRR I E QLQRDRRKV 
MEENNIVHQARFFT^RQTDSSGKEWWVT^TYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRLIGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW 
FKNDQDIQLSEHFSVKVEQAKYVSMTIKGVTSEDSGKYSINIKN 
KYGGEKIDVTVS VYKHGEKI PDMAPPQQAKPKLI PASASAAGQ 


7113 


1 


824 


KCIiRQAWHEAPS S LAFTR WCSREERAEGGGNIiHRS ITRDPKPPG 
LRPSQRPMDDKKKKRSPKPCLAQPAQAPGTIjRRVPVPTSHSGSt* 
ALGLPHLPS F KQRAKFKRVGKEKGR P VLAGGGSGS AGTPLQHS F 
LTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKLAR LHFS VCG EEEDDEEEEJDG VTEG LPEE0 KJCTMADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPLNLPRR 


7114 


3 


1492


VWEVDEQIDHYKESQDKFLWQAAFIGKETLKDESGQECKICRKI 
IYLNTDFVSVKQRL.PKYYSWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWKVCLHYNLHKAQPAERFFDPNQRGKALHQKQAJLRKSQRS 
QTIGEKLYKCrrECGKVFIQKANLVVHQRTHTGEKPYECCECAKAF 
SQKSTL IAHQRTHTGEKPYECSECGKTFIQKSTLI KHQRTHTGE 
KPFVCDKCPKAFKSSYHLIRHBKTHIRQAFYKGIKCTTSSLIYQ 
RIHTSEKPQCSEHGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRGKSHLS VHQR I HTGEKP YECSXCGKTFSGKSHLS VHHRTHTG 
EKPYECRRCGKAFGEKSTLIVHQRMHTGEKPYKCNECGKAFSEK 
SPLIKHQRIHTGERPYECTDCiCKAFSRKSTIiIKHQRIHrGEKPY 
KCSECGKAFSVKSTL»IVHHRTHTGEKPYECRDCGKAFSGKST1*I 
KHQRSHTGDKNL 


7115 


1 


947 


NAAHG YNWGLWCM Y 1 1 PPQDWLDRGDESAPI RTPAMI GCS F WD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPGVDFGDVS ERIALRQRLKCRS FKWYLENVYPEMRVYNNTLT 
YGEVRNSKASAYCX.DQGAEDGDRAILYPCHGMSSQLVRYSADGL 
LQLGPLGS TAFliPDS KC1»VDDGTGRMPTLKKCEDVARPTQRI>WD 
PTQSGP I VSRATGRCLEVEMS KDANFGLRLWQRCS GQKWM I RN 
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT 
PGSVINNIiSIimmEVDHLRDRNSGSSSSLNTTLPSTSAWSSIR 
ASNYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSUVHELWKVP 
LPPKNITAPSRPPPGLTGQKPPLSTWDNSPLRIGGGWGNSDARY 
TPGSSWGBSSSGRITNWLVLKNLTPQIDGSTLRTLCMQHGPLIT 
FHLNLPHGNALVRYSSXEE VVKAQKSLHI SDLPLLTL 


7117


695 


1261 


LLI STPGGCH PPPSS IE FTYTGAWGKALPAPHMPCAPGALPQGA 
FVSOAARAI PLLQPSQAAQAEGLSQPARACGALCSLPWPLRNWG 
S P I LRIiPGGLRT PTNDRKTRTRS AMACWARAQ WDTLG PLKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYAI.LALKEVEEISLLQPQVE "" 
ES VLNLGKFHS I VR LVAFCPFAS S OVAIiENANJVVQ Prwu^nT r? 

LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI 
LRGVRLHFHNLVKGLTDLSACKAQLGIjGHSYSRAKVKFNVNRVD 
NMI IQS ISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRLAQFIGNRREIiNEDKIjEKLEEIiTMDGAKAKAlLDASRSSMG 
MDI SAIDI>INI ES FSSRWSLSEYRQSLHTYLRSKMSQVAPSIjS 
AL> I GEAVG ARIi I AHAGS IiTNLAK Y P AS TVQ I LGAEKALFRALKT 
RGNTPKYGLI FHSTFIGRAAAKNKGRISRYLANKCS IASRIDCF 
SEVPTS VFGEKXiREQVEERLSFYETGEI prknldvmkbamvqae 
EAAAEITRKLEKQEKKRLKKEKKRLAAUVLASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MS SDLEETAGSTS I P KR KKSTPKEETVNDPEE AGHRSGS KKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALIALKEVEEISLLQPQVE" 
ESVLNLGKFHSTVRrjVAFr*PPa^cinvaT,P7a2VWTVVOirOTn7Tii?T^T r> 

LI^ETHLPS KKKKVLLGVGDP KIGAAIQEELGYNCQTGGVI AE I 
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAXVKFNVNRVD 
NMI IQSISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRIAQFIGNRRELNEDKIiEKLEELTMDGAKAKAILDASRSSMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
AL I GEAVGARL I AHAG S LTNLAKY PAS TVQ I LG AE KAL FRALKT 
RGNTP KYGLI FHSTFIGRAAAKNKGR ISRYLANKCS I ASR I DCF 
SEVPTSVFGEKLREQVBERLS FYETGEI PRKNLDVMKEAMVOAE 
EAAAEITRKLEKQEKKRLKKEKKRLAAIALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEI* 
MS SDLEETAGSTS I PKRKKS TP KEETVNDPEEAGHRSGS KKKRK 
FSKEEP VS SG PEEAAGKS S S KKKKKFHKASQED 


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQ D FL.VTNLE PRF I E PQTANLSWFKDS 
NSTTPIil FVLSPGTDPAADIiYKFAEEMKFSKKLSAISLGQGQG P 
RAEAMMRSS IERGKW VFFQNCHLAPSWMPALERIi I EHINPDKVH 
RDFRLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRANIXKSYSS 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTDGDLRICISQLKMFLDE YDDI PYKVLKYTAGE INYGGRVTD 
DWDRRC IMNI LEDFYNPDVIiSPEHS YS ASGI YHQ I PPTYDLHG Y 
LSYIKSLPLNDMPEI FGIiHDNAN ITFAQNETFALLGTI IQLQPK 
SSSAGS QGREEI VEDVTQNI LLKVPEP INLQWVMAKYP VL YEES 
MNTVLVQEVI RYNRLLQ VITQTLQDLLKAI1KGL VVMS SQLELMA 
ASLYlflNTVPELWSAKAYPSLKPLSSWVMDLLQRLDFLQAWIQDG 
I PAVFW ISGFFFPQAFLTGTLQNFARKFVISIDTI SFDFKVMFE 
APSELTQRPOVG CY I HGIiFLEGARWDPEAFQLAE S QPKEL YTEM 
AVI WLIjPTPNRKAQDQDF YLC P I YTCTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRH W I KRGVAL I CALDY 


7121 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTIAETVASLWPALQEIARCGNIACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE


7122 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTIAETVASLWPALQEIARCGNIACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE


7123 


1 


1092 


KPAVPEARSAGTSEAGRSGAEEVS CGS VSGDGAAMRLTP RALCS 
AAQAAWRE N Fp I»CG R DVARWFPGHMAKGLKKMQS S LKLVDCI I E 
VHDARIPIjSGRNPLFQETIjGLKPHLX»VLNKMD1JU!)L.TEQQKIMQ 
HLEGEGLKNVI FTNCVKDENVKQ 1 1 PMVTELIGRSHRYHRKEN1> 
EYCIMVIGVPNVGKSSLINSLRRQHLRKGKATRVGGEPGITRAV 
MSKI QVSERPLMFLLDTPGVLAPR I ESVETGLKLAIXGTVLDHL 
VGBETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVIiKSVAV 
klg ktq kv kvltgtgnvnv i qpn y paaard flqt frrgl iig s vm 
LDLDVLRGHPRV 


7124 


2 


382 


LPLTLIiLAAP FAHbliLPPGHDQSPCWHPGPALS PGTLGPLSWAM 
ANSG LQLLG Y FLALGGWVG I IASTALPQWKQSS YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE 
FI EliRKWbKARKFQDSNIiAPACFPGTGRGIjMSQTSIjQEGQMI I S 
LPESCbLT\RDTVIRSYLGAYITKWKPPPSPLLAbCTFLVSEKH 
AGHRSLDEAXYLEILPKAYTCPVCLEPEVVNLLPKSLKAKAEEQ 
RAHVQEFFASSRDFFSSLQPLFAEAVDSIFSYSALLWAWCTVNT 
RAVYL\ SPGSGNAFLQSRTPVQLAP YLDLLNHS PHVQVKAAFNE 
ETHSYEIRTTSRWRKHEEVFrCYGPHDNQRIjFLEYGFVSVHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFIVPSPARRCSQKGSLGHLPTQPWLWAAMSPRGQBRGT 
SHSQAREPQRPGRWLLGSLQS S PGTLGQAGTASRRRGCMVQRWV 
QVATGRRAVQ V P KGALGLALGETS PGASRGMSGGAGGCWALG WA 
PSPVLPS WLLEGPPPWLS 1 1 SDSGTQRPSPRRCPARPSPWGPQC 
WRGGR IASAEAS S T* TPGSGSRARSGRRS PGSRRRSASAPSPTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR* IEKRPFKEI *RRIPRIF 
AKQKQI *S*NSQKIGASEIDRGRKEADCSDAPAAARIGAVSVFR 
RSTQEARVSPRSNAKSANIiRAVRAD*WEHFVLLFHTPEQFIiAEC 
ICRST* *K* WHQIiC*P]jSSL*TGI,KRKLLL*VLFRI * WLKDCDV 
* FCQKI FATNFCNWQNLI Q * EE * KPVE YSVEN* HIMNLLLPM * Ii 
CQSSLRDQTI VTWRM*RNYSMFRINMI SSL* DGS IHI PLKLHFY 
PALI FTLT VP I NS CCQRPLPLFAHQS I KTLAS SGS PMLACLRFL 
LVKKRAF IHTPRS PGCS V* CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALREIiSQIBAELNKinflRRIjLEGIiSYYKPPSP 
SSAEKVKANKDVASPLKELGLRISKFLGLDEEQSVQLIiQCYLQE 
DYRGTRDS VKTVLQDERQSQALI LKI AD Y YYEE RTC I LRCVLHL 
LT YFQDERH PYRVE YADCVDKLEKELVS KYRQQ FEELYKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEI I FLYYAYFEMAPSD 
LLVLTKMFKEOGFGSRQTNRHLVDETMDPFVDRIGYFSAIjILVE 
GMDI ESLHKCALDDRRELHQFAQDGLI CQDMDCLMLTFGD I PHH 

apvllawallrhti^peetssvwkiggtaiqlnvfqyltrllq 
slasggndcttstacmcvyglls fviits lelhtlgnqqdi i dta 
ce vladpslpelfwgteptsglgi ildsvcgmfphlls pllqll 
ralvsgkstakkvysfldkmsfynelykhkphdvishedgtlwr 
rqtpkllyplggqtnlri pqgtvgqvmlddraylvrwe ys yss w 

TLFTCE IEMLLHWSTAD VI QHCQRVKP I IDLVHKVI STDLS I A 
DCUjP I TS R I YMLLQRLTTVIS PP VDVI ASCVNCLTVLAARNPA 
KVWTDIiRHTGFIiPFVAHPVSSLSQMISAEGMNAGGYGISILLMNSE 
QPQGEYGVTI AFLRL I TTLVKGQIX5STQSQGLVPCVMFVLKEML 
PSYHKWRYNSHGVREQIGCIiILELIHAILNliCHETDIiHSSHTPS 
LQFLCI CSIoAYTEAGQ WIKIMGlGViyriDlWMAAQPRSDGAEG 
QGQGQLLI KTVKLAFS VTNNV I RLKP PSNWS PLEQ AL SQHGAH 
GNNL IAVIiAKYI YHKHDPALPRLAI QLLKRLATVAPMS VYACLG 
NDAAA I RDAFLTRLQS K\ I E \ DMR I K\ VM 1 L\ E FLT VA \ VETQP 
GLIELFLNIiEVKDG\SDGSKEFSLGMW\SCLHAV/VWEIiIDSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPSILETCALIMKIICLEIYYWKGSLDQP 
LKDTLKKFS I E KRFAYWSGYVKS LAVHVAETEGSSCTS tiLE YQM 
Ij VS AWRMLIj 1 1 ATTHAD I MHLTDS WRRQL FLDVLDG TKALLL V 
PAS VNCLRLGSMK.CTLLL I LLRQWKRELGS VDE I LG PI/TE I LEG 
Vl^ADQQLMEKTKAKVFSAFITVLQMKEMKVSDI PQYSQLVLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RIXWCVLGLHIJUCELCEVDEIXSDSWIjQ^ 

S LRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGI TQS I CLPIi 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 
LR YN F JjPEALD fvg vhqertlqclnavrtvqs LACLEEADHTVG 
F I LQLSNFMKEWHFHL PQLMRDI Q VNLG YLCQACTS FLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASE(K3AriHTVQYGLLKILSKTI*AALRHFTPDVCQILLDQSLDLA 
E YNFLFALS FTTPTFDS EVAPS FGTLLATVNVALNMLGELDKKK 
E PI/TQAVGLS TQAEGTRTLKS IiIiMFTMENCF YIjI* I SQAMRYLRD 
PAVHPRDKQRMKQELSSELSTLLSSIiSRYFRRGAPSSPATGVLP 
S PQGKSTSLS KAS PESQEPL IQLVQAFVRHMQR 


7129 


1 


1054 


FRR FR WRRRLH *AGPASSAGGS PGEASGTMSGELPPNINIKEPR 
WDQSTFIGRANHFFTVTDPRNILIiTNEQLESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI 
TGCMMTFYRTTPAVLFWQW1NQSFNAVVNYTNRSGDAPLTVNEI* 
GTAYVSATTGAVATALGLNAIjTKHVS PLI GRFVPFAAVAAANC I 
N I PLMRQRELKVG I PVTDENGNRLGESANAAKQAI TQVWSR 1 1* 
MAAPGMAI PP F I MNTLE KKAFLKRFP WMS AP IQVGLVGFCLVFA 
TPLCCAIiFPQKSSMSVTSLEAEIiQAKIQESHPELRRVYFNKGIi | 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHIiYNSLGRKG 
ISAKSQPYHRSQSSSSVLINKSMDSINYPSDVGKQQLLSLHRSS 
RCES HQDLLPDIADSHQQGTE KLS DLTIiQDSQKWWNRNLPLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DS KFVDADFSDNVCSGNTLHS LNS PRTPKKPVNS KLGIiS P YT/TP 
YNDS DKLND YLWRG PS PNQQN I VQSLREKFQ CLSS S S FA 


7131 


805 


573 


AAAEGHIEWKFIilEACJFCVNPFAKDRWGNIPLDDAVQFNHLEW"^ 
KLLQD YQDS YTI*S E TQAEAAAEALS KENLESMV 


7132 


1420 


1087 


IDMLLLS G ALVSGP YTJL ITTAVSADLGTHKSIiKGNAHAIjS TVTA 
I IDGTGSVGAALGPLIiAGLLS PSGWSNVFYMLMFADACALIjFLI 
RLIHKELSCPGSATGDQVPFKEQ 


7133 


2 


3648 


00 1 PGLLPAHGES GDALRKPRLQKP I TGHIiDDLFFTIj YPS LEKF 
EEEIiLEIiHV QDHFQEGGGPLDGGALE I LERRIjRVGVHNGLGFVQ 
RPQVVVLVPEMDVALTRSAS FSRKWSS S KTSSGSQAIjVLRS RI» 
RLPEMVGHPAFAVIFQLEYVFSSPAGVDGNAASVTSLSNIiACMH 
MVRWAVWNPLLEADSGRVTLPMGGIQPNPSHCI.VYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQIAASPRSPTQHCL 
ARPTSQLPHGSQASPAQAQEFPLEAGISHLEADLSQTSLVLETS 
IAEQI^EIiPFTPIiHAPIWGTQTRSSAGQPSRASMVIiLQSSGFP 
EILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVIjQFIiAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS 
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFLKPGERRCFA 
R YIiAVQTLQI D VWDGDS LLL I GS AAVQMKHLLRQGRPAVQASHE 
LEWATEYEQDNMWSGDMLGFGRVKPIGVHSWKGRLHLTLAN 
VGHPCEQKVRGCSTIiPPSRSRVISNDGASRFSGGSLLTTGSSRR 
KHWQAQKLADVDS E LAAMLLTHARQGKGPQDVSR ESDATRRRK 
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAYR 
ERTKAESI ASLLSLAI TTEHTLHATLGVAEFFEFVLKNPHNTQH 
TVTVE I DNPELSVI VDS QEWRDFKGAAGLHTPVEEDMFHLRGSL 
APQLYLRPHETAHVPFKFQSFSAGQLAMVOASPGLSNEKGMDAV 
S PW KS SAVPTKHAKVLFRASGGKP I AVLCLTVELQPHWDQVFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDI FLKVASGPSPEI KDFFVI I YSDRWIATPT 

PQELKTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNIiVDVD 
CHQLVAS WLVCLCCRQPL I S KAFE I MI»AAGEGKGVNKR I TYTNP 
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV 
GEEEIBIYINDHEDKNEEAFCVKVIYQ 


7134 


2115 


1111 


GGEGFS YP PHVGLS LGTPliDPHY V1.DE VHYDN PT YEEGLI DNSG 
LRLF YTMD I RKYDAGVI EAGLWVSLFHTI PPGMPEFQSEGHCTI* 
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
LAYDDDFDFNFQEFQYLKEEQT1LPGDNLITECRYNTKDRAEMT 
WGGLSTRSEMCIiS YLLYYPRINLTRCAS I PDIMEQLQF IGVKEI 
YRPVTTWPFI I KS PKQ YKNLS FMDAMNKFKWTKKEGLS FN KLVXi 
S t»P VNVRCS KTDNAEWS IQGMTALPPDI ERPYKAEPLVCGTSSS 
SSLHRDFS INIiLVCLIiIiIiS CTLSTKSI* 


7135


2 


2072 


FVPRVTPRSLSLQGPKGESVGSITQPLPSSYLIFRAASESDGRC 
WliDALELALRCSS LLRLGTCKPGRDGE PGTS PDAS PSSLCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
j\x n&\3&iJ\j&a i ¥ijrAi?vKH\3 1 1 YVEQVQEELGELGEASQVETV5E 
ENKSLMWTLLKQLRPGMDLSRWLPTFVLEPRS FLNKLSDYYYH 
ADLIiSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYWPILG 
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 
GSITAKSRFYGNSLSALiLnGKATT.TFT.MPaPn'VTT TWDvau/^vr* 

ILYGTMTLiELGGKVTI ecaknnfqaqlefklkpffggsts inqi 

SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 
QRLRQHTVPLEEQTELESERU'JQHVTRAISKGDQHRATQEKFAL 
EEAQRQRARERQESLMPWKPOLFHLDPITOEWHYRYFnH^PwriP 

LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL 
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALHEAILSIREAQQELHRHCSAMLSSTARAAQA 
PTPGLLQS PRSWFxjLCVFIiACQLFINHIIiK 


7136 


2 


418 


DFVPSFRRPSGNTSQTVWLLRAATLEKEVAGLREKIMHLDDMLK 
SQQRKVRQMIEQLQNS KAV I QS KDATI QELKEKIA YIJBAENBEM 
HDRMEHLI EKQISHGNFSTQARAKTENPGS IRISKPPSPKPMPV 
IRWET 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT
RMEQLSDKESYKLSCQLEPENP 


7138


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT
RMEQLSDKESYKLSCQLEPENP


7139 


1 


357 


S LRN S ARGLKMAASAARGAAAIiRRS INQ P VAF VRR I P WTAASSQ 
LKEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGLRNALO 
QENHI I DGVKVQVHTRRPKLPQTSDDEKKDF 


7140 


1401 


1S57 


RASSI^VLKAWGGLIPSSFQQQHTGQYALEELFDLKVYDCFCSF 
NMOTSLEKQLRPSQPWPRGKCKKTPGWEEARPKAQDLRGDLGKT 
QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 

WTPKGQDPPLMFSEDYQKSLLBQYHLGLDQKLRKyWGEjLIWNF 
ADFMTNQCG ! 


7141 
7142 


124 

CCD 


1073 


LDSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVL 
VTPE KPLRRGLS HRS D PNAVAPAPQG VRLSLGPLS PEKLEE I LD 
EANRLAAQLEQCALQDRESAGEGLGPRRVKPSPRRETFVLKDSP 
VRDLLPTVNSLTRS TPS / LKQPDASTPE * * * EGVSQGS PGYI WK 
EALQHEEGVTHLQS VPCIQKPS I FSS \SRSTPPVRGRAGPSGRA 
AASEETRAAKXRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRIiNIjPVM 
GATRSNLQPP 


7143 


O O O 


839 


h I FLMLHME LiKMLS S VTbH I RAFL Y W I CLKPTSCL I FQNVIjNLL 
KK * S RAVG VWVMCRT/ YS SDLQVGVI KPWLLLGS QDAAHDLDT 
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNII,SYFP 
ECFEFIEEAKRKDGWLVHCNA 


7144 


3 


773 


SLEMSSDGEPI.SRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ 
GSTTSSSKNIAYNCCWDQCQACFNSSPDIADHIRSIHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA 
SFASQGGLARHVPTHFSOX3NSSKVSSQPKAKEESPSKAGMNKRR 
i>. jjru^. trnur r vt\\i j. laja x KHKAZCFNLSAHIESLGKG 
HS WFHSTVSILLFFQIKYKTIiQKNISTI ISKShKI 




1 


988


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
R CPAPRPAG VS YVI RDE VEK YNRNG VNALQLDPALNRIiFTAGRD 
S 1 1 RI WS VNQHKQDP YIASMBHHTDWVND I VLCCNG KTLI SASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKAIAYAKDKELVASAGLDR 
QIFLWDVNTLTALTASNNTVTTSSLSGNKDS I YSLAMNQLGTI I 
VSGSTEKVLRWDPRTCAKLMKLKGHTDNVKAIiLNRDGTQCLS 
GSSDGTIRLWSI*GQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKIYCTDLRNPDIRVliICE 



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WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the 
group consisting of SEQ ID NO: 1-1786 and 3573-5358, a mature protein coding portion 
of SEQ ID NO: 1-1786 and 3573-5358, an active domain of SEQ ID NO: 1-1786 and 
3573-5358, and complementary sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 
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(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 



14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in 
the sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 
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a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and 
3573-5358, a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active 
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a 
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786 
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 






WO 01/53312 



PCT/US00/34263 



20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144, 
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-1786 and 3573-5358. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 



24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 



26. The collection of claim 22, wherein the collection is provided in a 
computer-readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
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